sda.api.load#

Functions#

select_parser(data_file, folder_path)

Select the appropriate parser function for a given data file.

load_test(test_name[, datafilename_filter, ...])

Load data from a specific test by name or path.

parse_files(files[, command, columns_to_keep, ...])

Parse multiple files using the unified parser approach.

load_tests([test_names, include_2021_2022, ...])

Load multiple tests by names or discover all available tests.

Module Contents#

sda.api.load.select_parser(data_file, folder_path)#

Select the appropriate parser function for a given data file.

Unified approach: always returns parse_test_files, which supports ExcelFileHandler with XML discovery and Polars for all files.

See also

parse_test_files()

Unified parser used across all years.

sda.api.load.load_test(test_name, datafilename_filter='*.xls*', data_sharepoint='auto', command=None, columns_to_keep=None, verbose=1, column_not_found='raise', table_not_found='raise', suppress_polars_warnings=False, source='local', force_refresh=False, **kwargs)#

Load data from a specific test by name or path.

This is a high-level function that handles the complete workflow: path resolution, file discovery, and data parsing using the unified parser approach.

Parameters:
  • test_name (str | Path) – Name of the test to load (e.g. "T135") or a direct path to a data file.

  • datafilename_filter (str, optional) – Filter to apply to the data files, by default "*.xls*" (matches .xlsx, .xlsm, .xls).

  • data_sharepoint (str, optional) – Name of the data SharePoint where the test is located, by default "auto".

  • command (dict, optional) – Legacy reading commands. Now largely ignored in favor of automatic Excel table detection.

  • columns_to_keep (list, optional) – List of columns to keep (default: None).

  • verbose (int, default 1) – Verbosity level.

  • column_not_found (str, default 'raise') – What to do if a column is not found.

  • table_not_found (str, default 'raise') – What to do if no Excel tables are found in a file. Can be 'raise' or 'warn'.

  • suppress_polars_warnings (bool, default False) – If True, suppress Polars dtype warning messages during reading.

  • source (str, default 'local') –

    Data access mode.

    • "local" (default) — resolve the test from synced local folders configured in DATA_SHAREPOINTS (~/sda.json), or from a direct file path. This is the standard desktop mode.

    • "cloud" — download the test file on demand from SharePoint via the Microsoft Graph API. Requires authentication (browser login on first use, or AZURE_* environment variables for CI). test_name must be a canonical test ID (e.g. "T297"); direct file paths and CUSTOM_TESTS entries are not supported and raise ValueError.

    • "default" — try local first; if the test is not found locally (raises FileNotFoundError, ValueError, or NotImplementedError), automatically fall back to the cloud via the Microsoft Graph API. Direct file paths are never forwarded to the cloud.

  • force_refresh (bool, default False) – Only meaningful when source="cloud" or source="default" (when the cloud fallback is triggered). If True, delete the local cache for test_name and re-download from SharePoint even if a cached copy exists.

  • **kwargs – Additional arguments passed to the parser function.
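The three access modes boil down to a local-first fallback. A minimal sketch of the source="default" behaviour described above, with load_local and load_cloud as hypothetical stand-ins for the library's internal loaders:

```python
def load_with_fallback(test_name, load_local, load_cloud):
    # source="default": try the synced local folders first; on the
    # documented local failures, fall back to the Graph API download.
    try:
        return load_local(test_name)
    except (FileNotFoundError, ValueError, NotImplementedError):
        return load_cloud(test_name)
```

Note that only these three exception types trigger the fallback; any other error propagates unchanged.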

Returns:

pandas.DataFrame containing the test data with automatic table detection.

Return type:

pandas.DataFrame

Examples

Load by test name (local, default):

>>> from sda.api.load import load_test
>>> df = load_test("T183")

Load via Microsoft Graph API (web app / CI):

>>> df = load_test("T297", source="cloud")

Load with automatic local-first, cloud-fallback:

>>> df = load_test("T297", source="default")

Force re-download from SharePoint:

>>> df = load_test("T297", source="cloud", force_refresh=True)

Load with specific filter:

>>> df = load_test("T183", datafilename_filter="*_processed.xlsx")

Load from direct path:

>>> df = load_test("/path/to/data/T183_experiment.xlsx")
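For illustration, the default datafilename_filter pattern behaves like a shell glob. A quick check with Python's fnmatch (an assumption about the matching semantics; the library's internal discovery may differ slightly):

```python
from fnmatch import fnmatch

pattern = "*.xls*"  # default datafilename_filter
candidates = ["T183_data.xlsx", "T183_macro.xlsm", "legacy.xls", "notes.csv"]

# The trailing "*" matches the empty string, so plain ".xls" files
# are included alongside ".xlsx" and ".xlsm".
matches = [name for name in candidates if fnmatch(name, pattern)]
```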

See also

resolve_local_test_path()

Resolve a test name or path to the canonical folder path.

discover_data_file()

Find a data file in the resolved test folder using a filename pattern.

download_test_file_via_graph()

Graph API download used when source="cloud".

parse_test_files()

Unified parser for all supported years (2021+).

parse_files()

Parse one or many files directly when you already know the paths.

sda.api.load.parse_files(files, command=None, columns_to_keep=None, verbose=1, column_not_found='warn', table_not_found='raise', **kwargs)#

Parse multiple files using the unified parser approach.

Since all files now use the same unified parser (parse_test_files with ExcelFileHandler), this function directly calls that parser without any grouping logic.

Parameters:
  • files (str, Path, list, or dict) – Files to parse. Same format as parse_test_files (unified approach).

  • command (dict, optional) – Default command for reading files. If None, will use parser-specific defaults.

  • columns_to_keep (list, optional) – List of columns to keep (only applies to 2023+ data).

  • verbose (int, default 1) – Verbosity level.

  • column_not_found (str, default 'warn') – What to do if a column is not found.

  • table_not_found (str, default 'raise') – What to do if no Excel tables are found in a file. Can be 'raise' or 'warn'.

  • **kwargs – Additional arguments passed to parser functions.

Returns:

pandas.DataFrame combined from all files with automatic table detection.

Return type:

pandas.DataFrame

Examples

Parse a single file:

>>> from sda.api.load import parse_files
>>> df = parse_files("T183.xlsx")

Parse multiple files:

>>> files = ["T183.xlsx", "T196.xlsx"]
>>> df = parse_files(files)

Parse with legacy command format:

>>> files = {"data.xlsx": {}}  # Empty dict uses table detection
>>> df = parse_files(files)
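The accepted files shapes (str, Path, list, or dict) can all be reduced to the dict form shown in the last example. A hypothetical normalization sketch, not the library's actual implementation:

```python
from pathlib import Path

def normalize_files(files):
    # Hypothetical normalization to the {path: command} dict form.
    # Empty command dicts trigger automatic Excel table detection.
    if isinstance(files, (str, Path)):
        return {str(files): {}}
    if isinstance(files, list):
        return {str(f): {} for f in files}
    return {str(k): v for k, v in files.items()}
```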

See also

parse_test_files()

Unified parser function used under the hood.

list_all_files()

Discover eligible files to parse inside a test folder.

load_test()

High-level convenience wrapper to load a single test by name.

sda.api.load.load_tests(test_names=None, include_2021_2022=True, include_2023_current=True, source='local', force_refresh=False)#

Load multiple tests by names or discover all available tests.

This function provides a convenient way to load multiple test datasets at once, with automatic discovery if no specific test names are provided.

Parameters:
  • test_names (str | list[str], optional) –

    Test name, wildcard pattern, or list thereof. Examples: ["T183", "T196"], "T3*", ["T1*", "T297"].

    Wildcard characters (*, ?, [) trigger automatic expansion:

    • source="cloud" — expanded by querying SharePoint directly via the Graph API (list_tests_via_graph()). No local filesystem access is required.

    • source="local" — expanded by scanning synced local folders (list_all_tests()).

    • source="default" — union of local and cloud expansions; tests present in both are deduplicated (local ordering preserved).

    If None, discovers all available tests from local synced folders (only valid when source is "local" or "default").

  • include_2021_2022 (bool, default True) – Whether to include 2021-2022 test data in discovery.

  • include_2023_current (bool, default True) – Whether to include 2023-current test data in discovery.

  • source (str, default 'local') –

    Data access mode, passed through to load_test() for each test.

    • "local" — load each test from synced local folders only.

    • "cloud" — download each test via the Microsoft Graph API; wildcard patterns are resolved against SharePoint directly.

    • "default" — try local first for each test, fall back to the cloud if not found locally. Wildcard expansion takes the union of local and cloud results.

  • force_refresh (bool, default False) – Only meaningful when source="cloud" or source="default" (when the cloud fallback is triggered). Re-download each test even if a cached copy exists.
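The source="default" wildcard expansion described above (union of local and cloud results, deduplicated, local ordering preserved) can be sketched as:

```python
def union_expansions(local_tests, cloud_tests):
    # Keep every locally discovered test in its original order, then
    # append any cloud-only tests; a test present in both lists
    # appears exactly once, at its local position.
    seen = set(local_tests)
    merged = list(local_tests)
    for test in cloud_tests:
        if test not in seen:
            seen.add(test)
            merged.append(test)
    return merged
```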

Returns:

pandas.DataFrame combined from all loaded tests with automatic table detection. When more than one test name is loaded, each row has a Test column with the source test name (same convention as the dashboard). A single test behaves like load_test() (no synthetic Test column). Legacy per-file test columns are renamed to Test only in that multi-test case.

Return type:

pandas.DataFrame
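The multi-test row-tagging convention described above can be illustrated with plain pandas; the per-test frames here are hypothetical stand-ins for individual load_test() results:

```python
import pandas as pd

# Hypothetical per-test frames standing in for load_test() results.
frames = {
    "T183": pd.DataFrame({"value": [1, 2]}),
    "T196": pd.DataFrame({"value": [3]}),
}

# When more than one test is loaded, each row carries its source test
# name in a "Test" column before the frames are concatenated.
combined = pd.concat(
    [df.assign(Test=name) for name, df in frames.items()],
    ignore_index=True,
)
```

With a single test, no such column is added and the result matches load_test() directly.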

Examples

Load specific tests:

>>> from sda.api.load import load_tests
>>> df = load_tests(["T183", "T196"])

Load tests matching a wildcard (local filesystem):

>>> df = load_tests("T3*")

Load tests matching a wildcard directly from SharePoint (no local sync needed):

>>> df = load_tests("T3*", source="cloud")

Load tests with local-first, cloud-fallback (union of local + cloud for wildcards):

>>> df = load_tests("T3*", source="default")

Load all available tests:

>>> df = load_tests()  # Discovers and loads all tests

Load only recent tests:

>>> df = load_tests(include_2021_2022=False)

See also

list_tests_via_graph()

Cloud-native test discovery via Graph API (used for wildcard expansion when source="cloud").

list_all_tests()

Discover available test identifiers from local synced folders.

load_test()

Load a single test.

parse_files()

Parse multiple files directly.