# Installation

This page explains how to install, verify, update, and test SDA (Spark Data Access/Analysis).

## Installation Options

### Option 1: Dedicated Environment

Install `sda` in a dedicated environment:

```bash
git clone https://github.com/spark-cleantech/sda.git
cd sda                   # enter the cloned sda folder
conda env create --file environment.yml
conda activate sda
pip install -e .         # install current folder (`.`) in editable mode
```

### Option 2: Existing Environment

Install `sda` in an existing environment, for instance `spy`. Assuming you have already created the [spy](https://github.com/spark-cleantech/spy) Anaconda environment, clone the repository and install `sda`:

```bash
git clone https://github.com/spark-cleantech/sda.git
cd sda
conda activate spy
conda env update --name spy --file environment.yml    # update the spy environment with sda's dependencies
pip install -e .         # install current folder (`.`) in editable mode
```

## Verification

Check that the installation succeeded. Open a terminal and run:

```bash
conda activate sda     # or 'spy' if you installed in 'spy' environment
sda list
```

If SDA was installed properly, this lists all data files available on your local machine. You are now ready to use SDA.
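
As a complementary check from Python, the sketch below reports whether a package can be imported and whether a command is on `PATH`. It assumes the import name and the console command are both `sda`, which the commands above suggest but this page does not state explicitly. `check_install` is a hypothetical helper, not part of SDA.

```python
import importlib.util
import shutil


def check_install(package: str = "sda", command: str = "sda") -> dict:
    """Report whether `package` is importable and `command` is on PATH."""
    return {
        # find_spec returns None when the top-level package cannot be found
        "importable": importlib.util.find_spec(package) is not None,
        # which returns None when no executable with that name is on PATH
        "cli_on_path": shutil.which(command) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_install().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it with the target environment activated. If either entry is reported as missing, the most likely cause is that `pip install -e .` was run in a different environment.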

## Update Package

Pull the latest version from GitHub, then update the dependencies and reinstall:

```bash
cd sda
git pull
conda env update --name spy --file environment.yml    # use '--name sda' if you installed in the dedicated environment
pip install -e .
```

## Developer Setup

For developers and contributors, follow these additional steps:

### Prerequisites

Clone the repository:

```bash
git clone https://github.com/spark-cleantech/sda.git
cd sda
```

For the best experience, create a new conda environment (e.g. `sda-env`) with Python 3.11:

```bash
conda create -n sda-env -c conda-forge python=3.11 -y
conda activate sda-env
```

### Development Workflow

Before pushing to GitHub, run the following commands:

1. Update conda environment: `make conda-env-update`
1. Install this package in editable mode: `pip install -e .`
1. (optional) Sync with the latest [template](https://github.com/spark-cleantech/package-template): `make template-update`
1. (optional) Run quality assurance checks (code linting): `make qa`
1. (optional) Run tests: `make unit-tests`
1. (optional) Run the static type checker: `make type-check`
1. (optional) Build the documentation: `make docs-build`
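
The checklist above can also be scripted so that it stops at the first failing step. This is a sketch under assumptions: `run_steps` and `STEPS` are hypothetical names, the optional template sync and documentation build are left out, and `make` must be on `PATH`.

```python
import subprocess
import sys

# Commands taken from the checklist; the selection and order are illustrative.
STEPS = [
    ["make", "conda-env-update"],
    [sys.executable, "-m", "pip", "install", "-e", "."],
    ["make", "qa"],
    ["make", "unit-tests"],
    ["make", "type-check"],
]


def run_steps(steps, run=subprocess.run):
    """Run each command in order; return the first failing command, or None."""
    for cmd in steps:
        print("$", " ".join(cmd))
        if run(cmd).returncode != 0:
            return cmd  # stop early so later steps do not hide the failure
    return None
```

Calling `run_steps(STEPS)` from the repository root returns `None` when every step succeeds. The `run` parameter exists only so the function can be exercised without executing real commands.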

### Cross-Platform Notes

On Windows, `make` is not available by default. Either install it
([for instance with Chocolatey](https://stackoverflow.com/questions/32127524/how-to-install-and-use-make-in-windows)),
or open the [Makefile](./Makefile) and run the commands of the target you need manually.
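
When `make` is unavailable, one way to see which commands a target would run is to read them out of the Makefile. The sketch below handles only the simple case of tab-indented recipe lines under a `target:` header (no variable expansion, includes, or pattern rules). `recipe_lines` is a hypothetical helper, and `qa` is just one of the targets named above.

```python
from pathlib import Path


def recipe_lines(makefile_text: str, target: str) -> list[str]:
    """Return the commands listed under `target:` in a simple Makefile."""
    commands = []
    inside = False  # True while we are within the requested target's recipe
    for line in makefile_text.splitlines():
        if line.startswith("\t"):
            # Recipe lines are tab-indented; drop make's "@" and "-" prefixes.
            command = line.strip().lstrip("@-").strip()
            if inside and command:
                commands.append(command)
        elif line.strip() and not line.lstrip().startswith("#"):
            # Any other non-blank, non-comment line ends the current recipe.
            names = line.split(":", 1)[0].split() if ":" in line else []
            inside = target in names
    return commands


if __name__ == "__main__":
    makefile = Path("Makefile")
    if makefile.exists():
        for command in recipe_lines(makefile.read_text(), "qa"):
            print(command)
```

The printed commands may still contain make variables such as `$(NAME)`, which have to be substituted by hand before running them in a Windows shell.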

## Testing

The SDA project splits its tests into two separate suites, unit tests and end-to-end (E2E) tests, which can be run independently.

### Test Categories

#### Unit Tests

Fast-running tests that don't require browser automation:

```bash
# Run unit tests only (excludes E2E tests)
make unit-tests

# Run with coverage report
make unit-tests COV_REPORT=html
```

#### End-to-End (E2E) Tests

Browser-based tests using Playwright for complete workflow validation:

```bash
# Run E2E tests only
make e2e-tests

# Run E2E tests with detailed output
python -m pytest -vv -m "e2e" --tb=short
```

### Running All Tests

```bash
# Run both unit and E2E tests
python -m pytest tests/

# Run with specific markers
python -m pytest -m "not slow"  # Skip slow tests
```

### Test Structure

- **Unit Tests**: `tests/` directory, any file not ending with `_e2e.py`
- **E2E Tests**: `tests/dashboard/*_e2e.py` files, automatically marked with `@pytest.mark.e2e`
- **Test Configuration**: `tests/conftest.py` and `tests/dashboard/conftest.py`
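
The automatic `e2e` marking described above could be implemented with a collection hook in `conftest.py`. The following is a minimal sketch, not necessarily how SDA's own `conftest.py` does it: `is_e2e_file` is a hypothetical helper, and `item.path` assumes pytest 7 or later.

```python
from pathlib import Path

E2E_SUFFIX = "_e2e.py"


def is_e2e_file(path) -> bool:
    """Return True for test files that follow the `*_e2e.py` naming convention."""
    return Path(str(path)).name.endswith(E2E_SUFFIX)


def pytest_configure(config):
    # Register the marker so that `--strict-markers` does not reject it.
    config.addinivalue_line("markers", "e2e: browser-based end-to-end test")


def pytest_collection_modifyitems(config, items):
    # Tag every collected test that lives in a `*_e2e.py` file.
    for item in items:
        if is_e2e_file(item.path):
            item.add_marker("e2e")
```

With a hook like this in place, `-m "e2e"` selects only the browser tests and `-m "not e2e"` excludes them, without decorating each test function by hand.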

### CI/CD Testing

The CI pipeline runs tests in two separate jobs:

1. **Unit Tests Job**: Fast feedback (no browser dependencies)
1. **E2E Tests Job**: Comprehensive testing with browser automation (runs after unit tests pass)

This approach provides:

- Faster feedback for most common development scenarios
- Efficient resource utilization in CI
- Clear separation of concerns between test types

### Development Testing

For development, you can run specific test files:

```bash
# Test specific functionality
python -m pytest tests/dashboard/test_pipeline_pure.py -v

# Test with browser automation
python -m pytest tests/dashboard/test_scatter_filter_integration_e2e.py -v

# Run with live browser (for debugging)
python -m pytest tests/dashboard/test_dashboard_e2e.py -v --headed
```