sda.api.load
Functions
select_parser() | Select the appropriate parser function for a given data file.
load_test() | Load data from a specific test by name or path.
parse_files() | Parse multiple files using the unified parser approach.
load_tests() | Load multiple tests by names or discover all available tests.
Module Contents
- sda.api.load.select_parser(data_file, folder_path)
Select the appropriate parser function for a given data file.
Unified approach: always use parse_test_files, which supports ExcelFileHandler with XML Discovery + Polars for all files.
See also
parse_test_files() – Unified parser used across all years.
- sda.api.load.load_test(test_name, datafilename_filter='*.xls*', data_sharepoint='auto', command=None, columns_to_keep=None, verbose=1, column_not_found='raise', table_not_found='raise', suppress_polars_warnings=False, **kwargs)
Load data from a specific test by name or path.
This is a high-level function that handles the complete workflow: path resolution, file discovery, and data parsing using the unified parser approach.
- Parameters:
  - test_name (str | Path) – Name of the test to load, e.g. T135, or a direct path to the data file.
  - datafilename_filter (str, optional) – Filter to apply to the data files, by default '*.xls*' (matches .xlsx, .xlsm, .xls).
  - data_sharepoint (str, optional) – Name of the data sharepoint where the test is located, by default 'auto'.
  - command (dict, optional) – Legacy reading commands. Now largely ignored in favor of automatic Excel table detection.
  - columns_to_keep (list, optional) – List of columns to keep (default: None).
  - verbose (int, default 1) – Verbosity level.
  - column_not_found (str, default 'raise') – What to do if a column is not found.
  - table_not_found (str, default 'raise') – What to do if no Excel tables are found in a file. Can be 'raise' or 'warn'.
  - suppress_polars_warnings (bool, default False) – If True, suppress Polars dtype warning messages during reading.
  - **kwargs – Additional arguments passed to the parser function.
- Returns:
pandas.DataFrame containing the test data with automatic table detection.
- Return type:
pandas.DataFrame
Examples
Load by test name:
>>> from sda.api.load import load_test
>>> df = load_test("T183")
Load with specific filter:
>>> df = load_test("T183", datafilename_filter="*_processed.xlsx")
Load from direct path:
>>> df = load_test("/path/to/data/T183_experiment.xlsx")
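To make the effect of datafilename_filter concrete, the following minimal sketch shows how shell-style patterns such as the default '*.xls*' select filenames. It uses the standard-library fnmatch module purely for illustration: whether load_test matches patterns this way internally is an assumption, and the filenames and the filter_names helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical filenames in a test folder (not taken from the sda package).
candidates = [
    "T183_experiment.xlsx",
    "T183_processed.xlsx",
    "T183_macro.xlsm",
    "T183_legacy.xls",
    "T183_notes.txt",
]


def filter_names(names, pattern):
    """Return the names that match a shell-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# The default filter keeps every Excel variant and drops the text file.
default_matches = filter_names(candidates, "*.xls*")

# A narrower filter keeps only the processed workbook.
processed_matches = filter_names(candidates, "*_processed.xlsx")
```

A narrower filter is useful when a test folder holds several workbooks and only one of them should be loaded.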
See also
resolve_test_path() – Resolve a test name or path to the canonical folder path.
discover_data_file() – Find a data file in the resolved test folder using a filename pattern.
parse_test_files() – Unified parser for all supported years (2021+).
parse_files() – Parse one or many files directly when you already know the paths.
- sda.api.load.parse_files(files, command=None, columns_to_keep=None, verbose=1, column_not_found='warn', table_not_found='raise', **kwargs)
Parse multiple files using the unified parser approach.
Since all files now use the same unified parser (parse_test_files with ExcelFileHandler), this function directly calls that parser without any grouping logic.
- Parameters:
  - files (str, Path, list, or dict) – Files to parse. Same format as parse_test_files (unified approach).
  - command (dict, optional) – Default command for reading files. If None, will use parser-specific defaults.
  - columns_to_keep (list, optional) – List of columns to keep (only applies to 2023+ data).
  - verbose (int, default 1) – Verbosity level.
  - column_not_found (str, default 'warn') – What to do if a column is not found.
  - table_not_found (str, default 'raise') – What to do if no Excel tables are found in a file. Can be 'raise' or 'warn'.
  - **kwargs – Additional arguments passed to parser functions.
- Returns:
pandas.DataFrame combined from all files with automatic table detection.
- Return type:
pandas.DataFrame
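The column_not_found and table_not_found parameters choose between failing and continuing when something expected is missing. The sketch below illustrates such a policy for column selection. It is not the sda implementation: the select_columns helper is hypothetical, and it assumes that column_not_found accepts the same 'raise' and 'warn' values documented for table_not_found and that columns_to_keep=None keeps every column.

```python
import warnings


def select_columns(available, columns_to_keep, column_not_found="warn"):
    """Keep requested columns, handling missing ones per the chosen policy."""
    if column_not_found not in ("raise", "warn"):
        raise ValueError(f"unknown policy: {column_not_found!r}")
    if columns_to_keep is None:
        return list(available)  # assumption: None means keep everything
    missing = [c for c in columns_to_keep if c not in available]
    if missing:
        message = f"columns not found: {missing}"
        if column_not_found == "raise":
            raise KeyError(message)  # strict mode: stop immediately
        warnings.warn(message)  # lenient mode: report and continue
    # Return only the requested columns that actually exist, in request order.
    return [c for c in columns_to_keep if c in available]
```

With 'warn', a batch of files with slightly different columns can still be parsed in one call; with 'raise', a missing column stops the load so the mismatch is noticed early.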
Examples
Parse a single file:
>>> from sda.api.load import parse_files
>>> df = parse_files("T183.xlsx")
Parse multiple files:
>>> files = ["T183.xlsx", "T196.xlsx"]
>>> df = parse_files(files)
Parse with legacy command format:
>>> files = {"data.xlsx": {}}  # Empty dict uses table detection
>>> df = parse_files(files)
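Because files may be a str, a Path, a list, or a dict, a parser typically brings these forms into one shape before reading anything. The sketch below shows one way to do that, mapping each path to a command dict. The normalize_files helper is hypothetical and only illustrates the idea; how parse_test_files actually handles its input is not shown in this reference.

```python
from pathlib import Path


def normalize_files(files, default_command=None):
    """Map each file path (as a string) to its reading command dict."""
    default = {} if default_command is None else default_command
    if isinstance(files, dict):
        # Already a mapping: keep per-file commands, stringify the keys.
        return {str(path): dict(cmd or default) for path, cmd in files.items()}
    if isinstance(files, (str, Path)):
        files = [files]  # a single file becomes a one-element list
    # Every listed file gets its own copy of the default command.
    return {str(path): dict(default) for path in files}


single = normalize_files("T183.xlsx")
several = normalize_files(["T183.xlsx", Path("T196.xlsx")])
legacy = normalize_files({"data.xlsx": {}})
```

All three calls yield the same mapping shape, so the code that follows only has to handle one case.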
See also
parse_test_files() – Unified parser function used under the hood.
list_all_files() – Discover eligible files to parse inside a test folder.
load_test() – High-level convenience wrapper to load a single test by name.
- sda.api.load.load_tests(test_names=None, include_2021_2022=True, include_2023_current=True)
Load multiple tests by names or discover all available tests.
This function provides a convenient way to load multiple test datasets at once, with automatic discovery if no specific test names are provided.
- Parameters:
- Returns:
pandas.DataFrame combined from all loaded tests with automatic table detection.
- Return type:
pandas.DataFrame
Examples
Load specific tests:
>>> from sda.api.load import load_tests
>>> df = load_tests(["T183", "T196"])
Load all available tests:
>>> df = load_tests() # Discovers and loads all tests
Load only recent tests:
>>> df = load_tests(include_2021_2022=False)
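The fallback from explicit names to automatic discovery can be pictured with the sketch below. It is an illustration only: the resolve_test_names helper is hypothetical, and it assumes each test lives in its own sub-folder named after the test, which this reference does not state. In sda, discovery is handled by list_all_tests().

```python
import tempfile
from pathlib import Path


def resolve_test_names(test_names=None, root="."):
    """Use the given names, or fall back to discovering sub-folders of root."""
    if test_names is not None:
        return list(test_names)
    # Assumption: each test lives in its own sub-folder named after the test.
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


# Build a throwaway folder layout to exercise both branches.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("T196", "T183"):
        (Path(tmp) / name).mkdir()
    explicit = resolve_test_names(["T183"], root=tmp)
    discovered = resolve_test_names(None, root=tmp)
```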
See also
list_all_tests() – Discover available test identifiers.
load_test() – Load a single test.
parse_files() – Parse multiple files directly.