Database¶

Astra uses Peewee as its ORM. All models inherit from BaseModel (defined in src/astra/models/base.py), which binds them to a shared database connection.

Connection setup¶

The database connection is established at import time in src/astra/models/base.py by the get_database_and_schema function. The resolution order is:

ASTRA_DATABASE_PATH environment variable – If set, uses a SQLite database at that path.
TESTING flag in config – If True, uses an in-memory SQLite database (:memory:).
Config file – Reads database from the Astra config file (loaded by sdsstools.get_config). Supports both PostgreSQL (dbname key) and SQLite (path key).
Fallback – If nothing is configured, defaults to an in-memory SQLite database with a warning.

Astra looks for configuration files in the standard sdsstools locations:

~/.config/sdss/astra/astra.yml (user override)
src/astra/etc/astra.yml (package default)

PostgreSQL in production¶

In production (at Utah/CHPC), Astra uses PostgreSQL with a schema (e.g., astra_043). The BaseModel.Meta.schema attribute is set from the config. Astra wraps the PostgreSQL driver with retry logic (ResilientDatabase) to handle transient errors like deadlocks and connection drops.

SQLite for development¶

For local development and testing, SQLite is simpler. WAL journaling mode, a 64 MB cache, and synchronous=0 are set by default for performance.

Core tables¶

`Source`¶

Defined in src/astra/models/source.py. One row per astronomical source. Key columns:

pk – Auto-incrementing primary key.
sdss_id – The SDSS-V identifier (unique).
gaia_dr3_source_id, tic_v8_id, etc. – Cross-match identifiers.
Astrometric, photometric, and targeting columns.

`Spectrum`¶

Defined in src/astra/models/spectrum.py. A base table for all spectrum types. The SpectrumMixin adds computed properties like .e_flux and .plot().

Concrete spectrum types (ApogeeCoaddedSpectrumInApStar, BossVisitSpectrum, BossCombinedSpectrum, etc.) extend Spectrum and add instrument-specific columns and pixel-data accessors.

`PipelineOutputMixin`¶

Defined in src/astra/models/pipeline.py. The base class for every pipeline result table. Provides:

Column	Type	Purpose
`task_pk`	AutoField	Primary key, auto-assigned on insert.
`source_pk`	ForeignKey(Source)	Link to the source.
`spectrum_pk`	ForeignKey(Spectrum)	Link to the input spectrum.
`v_astra`	Integer	Astra version encoded as an integer (e.g., `0.8.1` becomes `8001`).
`created`	DateTime	When the row was first created.
`modified`	DateTime	When the row was last updated.
`t_elapsed`	Float	Seconds spent on this spectrum (analysis time).
`t_overhead`	Float	Seconds of overhead (model loading, I/O).
`tag`	Text	Free-form label for organising runs.

A generated column v_astra_major_minor (= v_astra / 1000) is used in the unique constraint (spectrum_pk, v_astra_major_minor). This means only one result per spectrum per major.minor version is kept; re-running with the same version updates the existing row.

Upsert behaviour¶

The bulk_insert_or_replace_pipeline_results function in src/astra/__init__.py performs batched upserts using PostgreSQL’s ON CONFLICT ... DO UPDATE (or SQLite equivalent). When a conflict occurs on (spectrum_pk, v_astra_major_minor), all fields except task_pk and created are overwritten. This means:

The first run for a given version inserts new rows.
Subsequent runs update existing rows in place.
The original created timestamp is preserved; modified is updated.

Custom fields¶

Astra defines several custom Peewee field types in src/astra/fields.py:

BitField – Stores multiple boolean flags in a single integer column. Individual flags are accessed as properties (e.g., result.flag_low_snr). Each flag is defined with result_flags.flag(2**N, help_text="...").
PixelArray – A virtual field (not stored in the database) that lazily loads per-pixel arrays (flux, ivar, wavelength) from FITS files. Uses accessor classes to control how data is loaded.
LogLambdaArrayAccessor – Generates a wavelength array from crval, cdelt, and naxis parameters instead of reading from disk.

Creating tables¶

In tests or local development, you can create tables manually:

from astra.models.source import Source
from astra.models.spectrum import Spectrum
from astra.models.rocket import Rocket

for model in (Source, Spectrum, Rocket):
    model.create_table()

In production, table creation is managed by database migrations.

Querying results¶

Peewee’s query API applies directly:

from astra.models.corv import Corv

# Get all results with teff > 10000
hot = Corv.select().where(Corv.teff > 10000)

# Join with Source to filter by sdss_id
from astra.models.source import Source
result = (
    Corv
    .select()
    .join(Source, on=(Corv.source_pk == Source.pk))
    .where(Source.sdss_id == 12345678)
    .first()
)