sda.api.graph_discovery
=======================

.. py:module:: sda.api.graph_discovery

.. autoapi-nested-parse::

   SharePoint test-file discovery and download via MS Graph API.

   Entry point: :func:`download_test_file_via_graph`.

   Cache policy
   ------------
   Downloaded files are stored in ``~/.sda/cache/{test_name}/``.

   * **Default** (``force_refresh=False``): if the cache directory is non-empty,
     the cached file is returned immediately and no network call is made.
   * **Refresh** (``force_refresh=True``): the cache directory for that test is
     deleted and the file is re-downloaded from SharePoint.

   Use ``force_refresh=True`` when you know a test file has been updated on
   SharePoint and you need the latest version.

   Connection notes
   ----------------
   A single :class:`httpx.Client` is created once per call to
   :func:`download_test_file_via_graph` and reused for all Graph API requests
   within that call (listing + any future batch work).

   The **download** step uses Graph's direct ``contentStream`` endpoint rather
   than the redirected SharePoint download URL. This keeps the transfer on the
   Graph host and avoids the extra redirect hop to SharePoint.


Functions
---------

.. autoapisummary::

   sda.api.graph_discovery.list_tests_via_graph
   sda.api.graph_discovery.list_tests_via_graph_with_sites
   sda.api.graph_discovery.is_graph_available
   sda.api.graph_discovery.download_test_file_via_graph


Module Contents
---------------

.. py:function:: list_tests_via_graph(pattern = '*', verbose = False)

   List test IDs from SharePoint that match a fnmatch *pattern*.

   This is the cloud-native equivalent of
   :func:`sda.api.file_discovery.list_all_tests` when working with
   ``source="cloud"``. It queries the Graph API directly and never reads the
   local filesystem, so it works on machines that have no local SharePoint
   sync configured.

   For each site in :data:`~sda.api.graph_sharepoint_ids.SHAREPOINT_SITE_MAP`
   that has valid ``site_id`` and ``drive_id`` entries, the function lists the
   immediate sub-folders of that site's ``base_path``.
   Folder names that match *pattern* are collected and returned as sorted,
   deduplicated test IDs.

   :param pattern: fnmatch-style glob pattern, e.g. ``"T3*"`` or ``"*"``.
                   Default ``"*"`` returns every test folder found across all sites.
   :type pattern: :py:class:`str`, *optional*
   :param verbose: If ``True``, print progress information for each site queried.
                   Default ``False``.
   :type verbose: :py:class:`bool`, *optional*

   :returns: Sorted list of matching test IDs, e.g. ``["T301", "T302", "T344"]``.
   :rtype: :py:class:`list[str]`

   :raises RuntimeError: If authentication fails.

   .. rubric:: Examples

   List every ``T3*`` test directly from SharePoint:

   >>> from sda.api.graph_discovery import list_tests_via_graph
   >>> tests = list_tests_via_graph("T3*")
   >>> print(tests)
   ['T301', 'T302', ...]

   .. seealso::

      :func:`download_test_file_via_graph`
          Download a single test file via Graph.


.. py:function:: list_tests_via_graph_with_sites(pattern = '*', verbose = False)

   Like :func:`list_tests_via_graph` but also returns the source site name.

   :param pattern: fnmatch-style glob pattern, e.g. ``"T3*"`` or ``"*"``.
   :type pattern: :py:class:`str`, *optional*
   :param verbose: If ``True``, print progress information for each site queried.
   :type verbose: :py:class:`bool`, *optional*

   :returns: Mapping of ``{test_name: site_name}`` for every matching test ID,
             sorted by test name. Only canonical test IDs are included
             (non-test SharePoint folders are skipped via :func:`_is_test_id`).
   :rtype: :py:class:`dict[str, str]`


.. py:function:: is_graph_available(site_name)

   Return ``True`` if ``site_name`` has a Graph API mapping.

   :param site_name: A ``DATA_SHAREPOINTS`` key, e.g. ``"6. DATA 2025"``.
   :type site_name: :py:class:`str`


.. py:function:: download_test_file_via_graph(test_name, datafilename_filter = '*.xls*', force_refresh = False, cache_dir = None)

   Download a test file from SharePoint via MS Graph API.

   :param test_name: Canonical test identifier, e.g. ``"T297"``.
                     Must not be a direct file path or a ``CUSTOM_TESTS`` entry
                     (both raise :class:`ValueError`).
   :type test_name: :py:class:`str`
   :param datafilename_filter: Glob pattern passed to
                               :meth:`~sda.api.graph_client.GraphClient.list_folder_children`
                               to identify the target file within the remote folder.
                               Default ``"*.xls*"``.
   :type datafilename_filter: :py:class:`str`, *optional*
   :param force_refresh: If ``True``, delete the local cache for ``test_name`` and
                         re-download. Default ``False``.
   :type force_refresh: :py:class:`bool`, *optional*
   :param cache_dir: Root directory for the local cache. Defaults to ``~/.sda/cache/``.
   :type cache_dir: :py:class:`Path`, *optional*

   :returns: Local path to the downloaded file (not the folder).
   :rtype: :py:class:`Path`

   :raises ValueError: If ``test_name`` is a direct path, a ``CUSTOM_TESTS`` entry, or
       if the resolved SharePoint site is not in
       :data:`~sda.api.graph_sharepoint_ids.SHAREPOINT_SITE_MAP`
       (i.e. the site/drive IDs have not been filled in yet).
   :raises FileNotFoundError: If no file matching ``datafilename_filter`` is found in
       the remote folder.
   :raises RuntimeError: If authentication fails.
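
The cache policy described at the top of this module can be illustrated with a small stand-alone sketch. It is not the module's implementation: ``get_cached_or_download`` and its ``download`` callback are hypothetical names introduced here, and returning the first file in sorted order is an assumption about how "the cached file" is chosen when the directory holds more than one.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


def get_cached_or_download(
    test_name: str,
    download: Callable[[Path], Path],
    force_refresh: bool = False,
    cache_root: Optional[Path] = None,
) -> Path:
    """Hypothetical helper mirroring the documented cache policy."""
    # Default root follows the documented layout ~/.sda/cache/{test_name}/.
    root = cache_root if cache_root is not None else Path.home() / ".sda" / "cache"
    test_cache = root / test_name

    # force_refresh=True: drop the per-test cache directory before anything else.
    if force_refresh and test_cache.exists():
        shutil.rmtree(test_cache)

    # Default path: a non-empty cache directory short-circuits the download.
    if test_cache.is_dir():
        cached_files = sorted(p for p in test_cache.iterdir() if p.is_file())
        if cached_files:
            # Assumption: the first file in sorted order is "the cached file".
            return cached_files[0]

    # Cache miss (or refresh): create the directory and delegate to the caller.
    test_cache.mkdir(parents=True, exist_ok=True)
    return download(test_cache)


if __name__ == "__main__":
    def fake_download(folder: Path) -> Path:
        # Stand-in for the real Graph download; writes a placeholder file.
        target = folder / "T297_data.xlsx"
        target.write_bytes(b"placeholder")
        return target

    with tempfile.TemporaryDirectory() as tmp:
        first = get_cached_or_download("T297", fake_download, cache_root=Path(tmp))
        second = get_cached_or_download("T297", fake_download, cache_root=Path(tmp))
        print(first == second)
```

Passing the download step in as a callback keeps the sketch free of network access, so the caching behaviour can be checked in isolation.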
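
The matching step documented for :func:`list_tests_via_graph` (collect folder names across sites, keep those matching the pattern, return them sorted and deduplicated) can be sketched without any Graph calls. ``match_test_ids`` and the ``folders_by_site`` input are hypothetical; the real function obtains folder names from the Graph API, and ``"site B"`` is an invented site name used only for illustration.

```python
from fnmatch import fnmatch
from typing import Dict, Iterable, List


def match_test_ids(
    folders_by_site: Dict[str, Iterable[str]],
    pattern: str = "*",
) -> List[str]:
    """Hypothetical helper: filter folder names per site with an fnmatch pattern."""
    # A set removes duplicates when the same test folder appears on several sites.
    matched = {
        name
        for folder_names in folders_by_site.values()
        for name in folder_names
        if fnmatch(name, pattern)
    }
    # Sorting gives the stable, ordered result described in the docs.
    return sorted(matched)


if __name__ == "__main__":
    sites = {
        "6. DATA 2025": ["T301", "T344", "Archive"],
        "site B": ["T302", "T301", "T410"],  # invented site name
    }
    print(match_test_ids(sites, "T3*"))
```

Note that :func:`fnmatch.fnmatch` normalises case according to the operating system; :func:`fnmatch.fnmatchcase` is the case-sensitive variant if identical behaviour across platforms matters.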