Database¶
Astra uses Peewee as its ORM. All models inherit from BaseModel (defined in src/astra/models/base.py), which binds them to a shared database connection.
Connection setup¶
The database connection is established at import time in src/astra/models/base.py by the get_database_and_schema function. The resolution order is:
ASTRA_DATABASE_PATHenvironment variable – If set, uses a SQLite database at that path.TESTINGflag in config – IfTrue, uses an in-memory SQLite database (:memory:).Config file – Reads
databasefrom the Astra config file (loaded bysdsstools.get_config). Supports both PostgreSQL (dbnamekey) and SQLite (pathkey).Fallback – If nothing is configured, defaults to an in-memory SQLite database with a warning.
Astra looks for configuration files in the standard sdsstools locations:
~/.config/sdss/astra/astra.yml(user override)src/astra/etc/astra.yml(package default)
PostgreSQL in production¶
In production (at Utah/CHPC), Astra uses PostgreSQL with a schema (e.g., astra_043). The BaseModel.Meta.schema attribute is set from the config. Astra wraps the PostgreSQL driver with retry logic (ResilientDatabase) to handle transient errors like deadlocks and connection drops.
SQLite for development¶
For local development and testing, SQLite is simpler. WAL journaling mode, a 64 MB cache, and synchronous=0 are set by default for performance.
Core tables¶
Source¶
Defined in src/astra/models/source.py. One row per astronomical source. Key columns:
pk– Auto-incrementing primary key.sdss_id– The SDSS-V identifier (unique).gaia_dr3_source_id,tic_v8_id, etc. – Cross-match identifiers.Astrometric, photometric, and targeting columns.
Spectrum¶
Defined in src/astra/models/spectrum.py. A base table for all spectrum types. The SpectrumMixin adds computed properties like .e_flux and .plot().
Concrete spectrum types (ApogeeCoaddedSpectrumInApStar, BossVisitSpectrum, BossCombinedSpectrum, etc.) extend Spectrum and add instrument-specific columns and pixel-data accessors.
PipelineOutputMixin¶
Defined in src/astra/models/pipeline.py. The base class for every pipeline result table. Provides:
Column |
Type |
Purpose |
|---|---|---|
|
AutoField |
Primary key, auto-assigned on insert. |
|
ForeignKey(Source) |
Link to the source. |
|
ForeignKey(Spectrum) |
Link to the input spectrum. |
|
Integer |
Astra version encoded as an integer (e.g., |
|
DateTime |
When the row was first created. |
|
DateTime |
When the row was last updated. |
|
Float |
Seconds spent on this spectrum (analysis time). |
|
Float |
Seconds of overhead (model loading, I/O). |
|
Text |
Free-form label for organising runs. |
A generated column v_astra_major_minor (= v_astra / 1000) is used in the unique constraint (spectrum_pk, v_astra_major_minor). This means only one result per spectrum per major.minor version is kept; re-running with the same version updates the existing row.
Upsert behaviour¶
The bulk_insert_or_replace_pipeline_results function in src/astra/__init__.py performs batched upserts using PostgreSQL’s ON CONFLICT ... DO UPDATE (or SQLite equivalent). When a conflict occurs on (spectrum_pk, v_astra_major_minor), all fields except task_pk and created are overwritten. This means:
The first run for a given version inserts new rows.
Subsequent runs update existing rows in place.
The original
createdtimestamp is preserved;modifiedis updated.
Custom fields¶
Astra defines several custom Peewee field types in src/astra/fields.py:
BitField– Stores multiple boolean flags in a single integer column. Individual flags are accessed as properties (e.g.,result.flag_low_snr). Each flag is defined withresult_flags.flag(2**N, help_text="...").PixelArray– A virtual field (not stored in the database) that lazily loads per-pixel arrays (flux, ivar, wavelength) from FITS files. Uses accessor classes to control how data is loaded.LogLambdaArrayAccessor– Generates a wavelength array fromcrval,cdelt, andnaxisparameters instead of reading from disk.
Creating tables¶
In tests or local development, you can create tables manually:
from astra.models.source import Source
from astra.models.spectrum import Spectrum
from astra.models.rocket import Rocket
for model in (Source, Spectrum, Rocket):
model.create_table()
In production, table creation is managed by database migrations.
Querying results¶
Peewee’s query API applies directly:
from astra.models.corv import Corv
# Get all results with teff > 10000
hot = Corv.select().where(Corv.teff > 10000)
# Join with Source to filter by sdss_id
from astra.models.source import Source
result = (
Corv
.select()
.join(Source, on=(Corv.source_pk == Source.pk))
.where(Source.sdss_id == 12345678)
.first()
)