sda.api.graph_discovery

SharePoint test-file discovery and download via MS Graph API.

Entry point: download_test_file_via_graph().

Cache policy

Downloaded files are stored in ~/.sda/cache/{test_name}/.

  • force_refresh=False (default): if the cache directory for the test is non-empty, the cached file is returned immediately and no network call is made.

  • force_refresh=True: the cache directory for that test is deleted and the file is re-downloaded from SharePoint.

Use force_refresh=True when you know a test file has been updated on SharePoint and you need the latest version.
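
The policy above can be sketched with the standard library alone. This is an illustration under assumptions, not the module's actual code: cached_fetch and its fetch callback are hypothetical names, and picking the first file in the directory is a guess at how a "non-empty" cache is resolved.

```python
import shutil
from pathlib import Path
from typing import Callable


def cached_fetch(
    test_name: str,
    fetch: Callable[[Path], Path],
    cache_root: Path,
    force_refresh: bool = False,
) -> Path:
    """Return a cached file for test_name, calling fetch only when needed.

    fetch is a hypothetical stand-in for the real download step: it receives
    the per-test cache directory and returns the path of the file it wrote.
    """
    target_dir = cache_root / test_name

    # force_refresh=True: drop the whole per-test cache directory first.
    if force_refresh and target_dir.exists():
        shutil.rmtree(target_dir)

    # Default path: a non-empty cache directory short-circuits the download.
    if target_dir.is_dir():
        cached = sorted(p for p in target_dir.iterdir() if p.is_file())
        if cached:
            return cached[0]  # assumption: one file per test directory

    target_dir.mkdir(parents=True, exist_ok=True)
    return fetch(target_dir)
```

With this shape, a second call for the same test never reaches fetch unless force_refresh=True is passed.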

Connection notes

A single httpx.Client is created once per call to download_test_file_via_graph() and reused for every Graph API request made within that call (the folder listing, plus any batch requests added in the future).

The download step uses Graph’s direct contentStream endpoint rather than the redirected SharePoint download URL. This keeps the transfer on the Graph host and avoids the extra redirect hop to SharePoint.
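
As a rough illustration of the two request shapes involved, the sketch below only builds URLs and performs no network I/O. The helper names are hypothetical, and the exact paths used by this module are an assumption based on the public Graph v1.0 addressing scheme; the module may construct them differently.

```python
from urllib.parse import quote

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def children_url(drive_id: str, folder_path: str) -> str:
    """URL that lists the children of a folder addressed by path (assumed shape)."""
    encoded = quote(folder_path.strip("/"))  # keeps "/" and escapes spaces
    return f"{GRAPH_ROOT}/drives/{drive_id}/root:/{encoded}:/children"


def content_stream_url(drive_id: str, item_id: str) -> str:
    """URL that serves file bytes from the Graph host itself (assumed shape)."""
    return f"{GRAPH_ROOT}/drives/{drive_id}/items/{item_id}/contentStream"
```

Both URLs share one host, which is what lets a single client and its connection pool serve the listing and the download.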

Functions

list_tests_via_graph([pattern, verbose])

List test IDs from SharePoint that match a fnmatch pattern.

list_tests_via_graph_with_sites([pattern, verbose])

Like list_tests_via_graph() but also returns the source site name.

is_graph_available(site_name)

Return True if site_name has a Graph API mapping.

download_test_file_via_graph(test_name[, ...])

Download a test file from SharePoint via MS Graph API.

Module Contents

sda.api.graph_discovery.list_tests_via_graph(pattern='*', verbose=False)

List test IDs from SharePoint that match a fnmatch pattern.

This is the cloud-native equivalent of sda.api.file_discovery.list_all_tests() when working with source="cloud". It queries the Graph API directly and never reads the local filesystem, so it works on machines that have no local SharePoint sync configured.

For each site in SHAREPOINT_SITE_MAP that has valid site_id and drive_id entries, the function lists the immediate sub-folders of that site’s base_path. Folder names that match pattern are collected and returned as sorted, deduplicated test IDs.
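
The collection step described above can be sketched offline as follows. The site map, folder listings, and the collect_test_ids helper are all hypothetical stand-ins (the real SHAREPOINT_SITE_MAP keys, IDs, and Graph responses will differ); only the skip, match, deduplicate, and sort logic is the point.

```python
from fnmatch import fnmatch
from typing import Callable, Dict, List

# Hypothetical stand-in for SHAREPOINT_SITE_MAP.
SITE_MAP: Dict[str, Dict[str, str]] = {
    "site-a": {"site_id": "id-a", "drive_id": "drive-a", "base_path": "Tests"},
    "site-b": {"site_id": "", "drive_id": "", "base_path": "Tests"},  # IDs not filled in
    "site-c": {"site_id": "id-c", "drive_id": "drive-c", "base_path": "Tests"},
}

# Hypothetical folder listings standing in for Graph API responses.
FOLDERS: Dict[str, List[str]] = {
    "site-a": ["T301", "T302", "Archive"],
    "site-b": ["T399"],
    "site-c": ["T302", "T344", "T410"],
}


def collect_test_ids(
    pattern: str,
    site_map: Dict[str, Dict[str, str]],
    list_subfolders: Callable[[str], List[str]],
) -> List[str]:
    """Collect folder names matching pattern across all usable sites."""
    matches = set()
    for site_name, entry in site_map.items():
        # Skip sites whose site_id or drive_id is missing or empty.
        if not (entry.get("site_id") and entry.get("drive_id")):
            continue
        for folder in list_subfolders(site_name):
            if fnmatch(folder, pattern):
                matches.add(folder)  # the set removes cross-site duplicates
    return sorted(matches)


result = collect_test_ids("T3*", SITE_MAP, FOLDERS.__getitem__)
```

Here result is ['T301', 'T302', 'T344']: "T399" is dropped because its site has no IDs, "T302" appears once despite living on two sites, and "Archive" and "T410" do not match the pattern.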

Parameters:
  • pattern (str, optional) – fnmatch-style glob pattern, e.g. "T3*" or "*". Default "*" returns every test folder found across all sites.

  • verbose (bool, optional) – If True, print progress information for each site queried. Default False.

Returns:

Sorted list of matching test IDs, e.g. ["T301", "T302", "T344"].

Return type:

list[str]

Raises:

RuntimeError – If authentication fails.

Examples

List every T3* test directly from SharePoint:

>>> from sda.api.graph_discovery import list_tests_via_graph
>>> tests = list_tests_via_graph("T3*")
>>> print(tests)
['T301', 'T302', ...]

See also

download_test_file_via_graph()

Download a single test file via Graph.

sda.api.graph_discovery.list_tests_via_graph_with_sites(pattern='*', verbose=False)

Like list_tests_via_graph() but also returns the source site name.

Parameters:
  • pattern (str, optional) – fnmatch-style glob pattern, e.g. "T3*" or "*".

  • verbose (bool, optional) – If True, print progress information for each site queried.

Returns:

Mapping of {test_name: site_name} for every matching test ID, sorted by test name. Only canonical test IDs are included (non-test SharePoint folders are skipped via _is_test_id()).

Return type:

dict[str, str]
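
A common follow-up is to invert the returned mapping so tests are grouped per site. The sketch below works on a hand-written sample dict rather than a live call; group_by_site is a hypothetical helper and "5. DATA 2024" is an invented site name used only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_site(test_sites: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert {test_name: site_name} into {site_name: [test_name, ...]}."""
    grouped = defaultdict(list)
    for test_name, site_name in test_sites.items():
        grouped[site_name].append(test_name)
    # Sort both the site names and each site's tests for stable output.
    return {site: sorted(tests) for site, tests in sorted(grouped.items())}


# Sample data shaped like the documented return value (not real results).
sample = {"T301": "6. DATA 2025", "T302": "6. DATA 2025", "T344": "5. DATA 2024"}
by_site = group_by_site(sample)
```

For the sample above, by_site is {"5. DATA 2024": ["T344"], "6. DATA 2025": ["T301", "T302"]}.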

sda.api.graph_discovery.is_graph_available(site_name)

Return True if site_name has a Graph API mapping.

Parameters:

site_name (str) – A DATA_SHAREPOINTS key, e.g. "6. DATA 2025".

sda.api.graph_discovery.download_test_file_via_graph(test_name, datafilename_filter='*.xls*', force_refresh=False, cache_dir=None)

Download a test file from SharePoint via MS Graph API.

Parameters:
  • test_name (str) – Canonical test identifier, e.g. "T297". Must not be a direct file path or a CUSTOM_TESTS entry (both raise ValueError).

  • datafilename_filter (str, optional) – Glob pattern passed to list_folder_children() to identify the target file within the remote folder. Default "*.xls*".

  • force_refresh (bool, optional) – If True, delete the local cache for test_name and re-download. Default False.

  • cache_dir (Path, optional) – Root directory for the local cache. Defaults to ~/.sda/cache/.

Returns:

Local path to the downloaded (or previously cached) file itself, not to its containing folder.

Return type:

Path

Raises:
  • ValueError – If test_name is a direct path or a CUSTOM_TESTS entry, or if the resolved SharePoint site is not in SHAREPOINT_SITE_MAP (i.e. its site and drive IDs have not been filled in yet).

  • FileNotFoundError – If no file matching datafilename_filter is found in the remote folder.

  • RuntimeError – If authentication fails.
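
One way a caller might handle the documented exceptions is sketched below. try_download is a hypothetical wrapper, not part of this module; it takes the download function as an argument so the pattern can be read (and exercised) without SharePoint access. Letting RuntimeError propagate is a design choice made here, on the assumption that an authentication failure should stop a batch rather than be skipped.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def try_download(
    download: Callable[..., Path], test_name: str, **kwargs
) -> Tuple[Optional[Path], str]:
    """Call download(test_name, **kwargs) and map expected failures to a message.

    Returns (path, "ok") on success or (None, reason) for recoverable errors.
    RuntimeError (authentication failure) is deliberately not caught.
    """
    try:
        return download(test_name, **kwargs), "ok"
    except FileNotFoundError as exc:
        # No remote file matched datafilename_filter.
        return None, f"no matching file: {exc}"
    except ValueError as exc:
        # Direct path, CUSTOM_TESTS entry, or site without Graph IDs.
        return None, f"unsupported test or site: {exc}"
```

In real use the first argument would be download_test_file_via_graph, for example try_download(download_test_file_via_graph, "T297", force_refresh=True).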