The Payne¶

The Payne is a neural network-based spectral emulator that estimates stellar labels by fitting a forward model to observed APOGEE spectra. Unlike pipelines that directly predict labels from spectra, The Payne uses a trained neural network as a fast spectral emulator and optimizes labels to best match the observed spectrum.

What it does¶

The Payne estimates a comprehensive set of stellar labels from APOGEE spectra:

Stellar parameters: Teff, log g, microturbulence (v_turb), macroturbulence (v_macro)
Chemical abundances: [C/H], [N/H], [O/H], [Na/H], [Mg/H], [Al/H], [Si/H], [P/H], [S/H], [K/H], [Ca/H], [Ti/H], [V/H], [Cr/H], [Mn/H], [Fe/H], [Co/H], [Ni/H], [Cu/H], [Ge/H]
Isotope ratio: 12C/13C
Radial velocity: v_rel (if enabled via v_rad_tolerance)
Fit quality: chi-squared and reduced chi-squared

How it works¶

Neural network emulator¶

The Payne uses a simple feedforward neural network with two hidden layers and leaky ReLU activations to emulate stellar spectra. Given a set of stellar labels, the network rapidly predicts the corresponding spectrum:

input labels -> Dense + Leaky ReLU -> Dense + Leaky ReLU -> Dense -> predicted spectrum

The network weights and biases are loaded from a pre-trained model (payne_apogee_nn.pkl). The labels are internally scaled to a [0, 1] range using pre-computed x_min and x_max values.
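
As a rough illustration of the forward pass described above, here is a minimal NumPy sketch. The function names, the `(w0, w1, w2)` / `(b0, b1, b2)` layout, and the leaky ReLU slope of 0.01 are assumptions made for this example; the actual contents of payne_apogee_nn.pkl may be organized differently, and the random weights below are not a trained model.

```python
import numpy as np


def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: identity for positive inputs, slope alpha otherwise.

    The slope alpha=0.01 is an assumed value for illustration.
    """
    return np.where(z > 0, z, alpha * z)


def emulate_spectrum(labels, weights, biases, x_min, x_max):
    """Predict a spectrum from physical labels with a two-hidden-layer network."""
    # Scale the labels to the [0, 1] range using x_min and x_max.
    x = (np.asarray(labels, dtype=float) - x_min) / (x_max - x_min)
    w0, w1, w2 = weights
    b0, b1, b2 = biases
    hidden1 = leaky_relu(w0 @ x + b0)        # Dense + Leaky ReLU
    hidden2 = leaky_relu(w1 @ hidden1 + b1)  # Dense + Leaky ReLU
    return w2 @ hidden2 + b2                 # Dense (linear output)


# Toy network: 3 labels, 8 hidden units, 20 pixels, random (untrained) weights.
rng = np.random.default_rng(0)
n_labels, n_hidden, n_pixels = 3, 8, 20
weights = (
    rng.normal(size=(n_hidden, n_labels)),
    rng.normal(size=(n_hidden, n_hidden)),
    rng.normal(size=(n_pixels, n_hidden)),
)
biases = (
    rng.normal(size=n_hidden),
    rng.normal(size=n_hidden),
    rng.normal(size=n_pixels),
)
x_min = np.array([3000.0, 0.0, -2.0])  # hypothetical label minima
x_max = np.array([8000.0, 5.5, 0.5])   # hypothetical label maxima
spectrum = emulate_spectrum([4800.0, 2.5, -0.3], weights, biases, x_min, x_max)
```

Because the forward pass is only three matrix products, it is cheap enough to evaluate many times inside an optimizer, which is what makes the emulator approach practical.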

Fitting procedure¶

For each observed spectrum, The Payne:

1. Continuum-normalizes the observed spectrum using a Chebyshev polynomial (degree 4 by default) fit across three APOGEE detector regions:
   - 15,100 – 15,793 Angstroms
   - 15,880 – 16,417 Angstroms
   - 16,499 – 17,000 Angstroms
2. Interpolates the observed spectrum onto the model wavelength grid.
3. Optimizes the stellar labels using scipy.optimize.curve_fit with the Trust Region Reflective (TRF) method, minimizing the chi-squared between the model and the observed spectrum.
4. Estimates uncertainties from the covariance matrix of the fit. Correlation coefficients between all pairs of labels are also computed and stored.
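
The steps above can be sketched end to end on synthetic data. This is a simplified illustration, not the pipeline's code: it assumes one independent polynomial per detector region, takes uncertainties as the square roots of the covariance diagonal, and replaces the neural network with a toy two-line model (`line_spectrum`, `toy_model`). All function names are hypothetical. The sketch also fits the continuum to every pixel, so absorption lines pull the continuum slightly low.

```python
import numpy as np
from numpy.polynomial import chebyshev
from scipy.optimize import curve_fit

# Detector regions in Angstroms, as listed above.
REGIONS = [(15100.0, 15793.0), (15880.0, 16417.0), (16499.0, 17000.0)]


def fit_continuum(wavelength, flux, ivar, deg=4):
    """Fit one Chebyshev polynomial per region (an assumption of this sketch)."""
    continuum = np.full(flux.shape, np.nan)
    for lo, hi in REGIONS:
        in_region = (wavelength >= lo) & (wavelength <= hi)
        usable = in_region & (ivar > 0)
        x = 2.0 * (wavelength - lo) / (hi - lo) - 1.0  # map the region to [-1, 1]
        coeffs = chebyshev.chebfit(x[usable], flux[usable], deg, w=np.sqrt(ivar[usable]))
        continuum[in_region] = chebyshev.chebval(x[in_region], coeffs)
    return continuum  # NaN outside the three regions


def to_model_grid(model_wavelength, wavelength, flux, ivar, continuum):
    """Interpolate the normalized flux and its error onto the model grid."""
    good = np.isfinite(continuum) & (ivar > 0)
    norm_flux = np.interp(model_wavelength, wavelength[good], flux[good] / continuum[good])
    norm_sigma = np.interp(
        model_wavelength, wavelength[good], 1.0 / (np.sqrt(ivar[good]) * continuum[good])
    )
    # Give model pixels in the gaps between regions a huge error so they are ignored.
    covered = np.zeros(model_wavelength.shape, dtype=bool)
    for lo, hi in REGIONS:
        covered |= (model_wavelength >= lo) & (model_wavelength <= hi)
    norm_sigma[~covered] = 1e10
    return norm_flux, norm_sigma


def fit_labels(model, norm_flux, norm_sigma, p0):
    """Bounded TRF fit of scaled labels in [0, 1], with formal uncertainties."""
    n_labels = len(p0)
    popt, pcov = curve_fit(
        lambda _pixels, *theta: model(np.asarray(theta)),
        np.arange(norm_flux.size),  # placeholder x data; the model ignores it
        norm_flux,
        p0=p0,
        sigma=norm_sigma,
        absolute_sigma=True,
        bounds=(np.zeros(n_labels), np.ones(n_labels)),
        method="trf",
    )
    return popt, np.sqrt(np.diag(pcov)), pcov


# Synthetic demonstration with a toy two-label "emulator" (not the real network).
rng = np.random.default_rng(1)
line_centers = np.array([15400.0, 16100.0])


def line_spectrum(wl, theta):
    """Two Gaussian absorption lines whose depths scale with the two labels."""
    profiles = np.exp(-0.5 * ((wl[:, None] - line_centers) / 2.0) ** 2)
    return 1.0 - 0.3 * profiles @ theta


model_wavelength = np.linspace(15150.0, 16950.0, 1801)


def toy_model(theta):
    return line_spectrum(model_wavelength, theta)


theta_true = np.array([0.3, 0.7])
wavelength = np.linspace(15100.0, 17000.0, 2000)
true_continuum = 100.0 * (1.0 + 0.1 * (wavelength - 16000.0) / 1000.0)
noise = 0.5
flux = true_continuum * line_spectrum(wavelength, theta_true)
flux = flux + rng.normal(0.0, noise, wavelength.size)
ivar = np.full(wavelength.size, 1.0 / noise**2)

continuum = fit_continuum(wavelength, flux, ivar)
norm_flux, norm_sigma = to_model_grid(model_wavelength, wavelength, flux, ivar, continuum)
labels, e_labels, cov = fit_labels(toy_model, norm_flux, norm_sigma, p0=[0.5, 0.5])
```

Passing `bounds` with `method="trf"` keeps the scaled labels inside [0, 1] throughout the optimization, and `absolute_sigma=True` makes the returned covariance reflect the supplied pixel errors rather than a rescaled estimate.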

Radial velocity¶

Radial velocity fitting is controlled by the v_rad_tolerance parameter (default: 0, meaning disabled). When enabled, an additional label representing the radial velocity is optimized simultaneously with the stellar labels, and the model spectrum is Doppler-shifted accordingly.
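
To illustrate what Doppler-shifting the model involves, the sketch below shifts a rest-frame spectrum by a given velocity and resamples it onto the original grid. The function name `doppler_shift`, the non-relativistic formula, and the fill value of 1 at the grid edges are assumptions for this example; the source does not specify how the shift is implemented.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def doppler_shift(wavelength, model_flux, v_rel):
    """Shift a rest-frame model by v_rel (km/s) and resample on the same grid.

    Uses the non-relativistic approximation
    lambda_observed = lambda_rest * (1 + v_rel / c).
    """
    shifted_wavelength = wavelength * (1.0 + v_rel / C_KM_S)
    # Grid pixels with no shifted model coverage are filled with 1 (continuum).
    return np.interp(wavelength, shifted_wavelength, model_flux, left=1.0, right=1.0)


# A single Gaussian absorption line at 16,000 Angstroms, shifted by +30 km/s.
wavelength = np.linspace(15950.0, 16050.0, 2001)
rest = 1.0 - 0.5 * np.exp(-0.5 * ((wavelength - 16000.0) / 0.5) ** 2)
shifted = doppler_shift(wavelength, rest, v_rel=30.0)

line_center = wavelength[np.argmin(shifted)]
expected_center = 16000.0 * (1.0 + 30.0 / C_KM_S)
```

In a simultaneous fit, `v_rel` would simply be one more free parameter passed to a function like this inside the model evaluated by the optimizer.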

Pixel masking¶

A mask (payne_apogee_mask.npy) can be applied to exclude certain spectral pixels from the fit. Masked pixels have their inverse variance set to zero.
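
The effect of zeroing the inverse variance can be seen in a small sketch. The helper names are hypothetical, and the convention that `True` marks an excluded pixel is an assumption; the convention stored in payne_apogee_mask.npy may be the opposite.

```python
import numpy as np


def apply_pixel_mask(ivar, mask):
    """Return a copy of ivar with masked pixels set to zero.

    Assumes mask is True where a pixel should be excluded.
    """
    masked_ivar = np.array(ivar, dtype=float, copy=True)
    masked_ivar[np.asarray(mask, dtype=bool)] = 0.0
    return masked_ivar


def chi_squared(flux, model, ivar):
    """Chi-squared with inverse-variance weights; zero-ivar pixels contribute nothing."""
    return float(np.sum(ivar * (flux - model) ** 2))


flux = np.array([1.0, 0.9, 5.0, 1.1])    # third pixel is a strong outlier
model = np.ones(4)
ivar = np.full(4, 100.0)
mask = np.array([False, False, True, False])

chi2_all = chi_squared(flux, model, ivar)
chi2_masked = chi_squared(flux, model, apply_pixel_mask(ivar, mask))
```

With the outlier masked, its residual no longer contributes to chi-squared, so it cannot influence the fitted labels.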

Noise model¶

A post-hoc noise model is applied to adjust formal uncertainties. The raw (formal) uncertainties from the fit covariance matrix are stored as raw_e_* fields, while the corrected uncertainties (using empirical calibration from ThePayne_corrections.pkl) are stored as e_* fields.

Output fields¶

Stellar parameters¶

Field	Type	Description
`teff`	float	Effective temperature (K)
`e_teff`	float	Uncertainty in Teff
`logg`	float	Surface gravity (log10(cm/s^2))
`e_logg`	float	Uncertainty in log g
`v_turb`	float	Microturbulent velocity (km/s)
`e_v_turb`	float	Uncertainty in v_turb
`v_macro`	float	Macroturbulent velocity (km/s)
`e_v_macro`	float	Uncertainty in v_macro
`v_rel`	float	Relative radial velocity (km/s)

Chemical abundances¶

Each abundance is reported as [X/H] in dex:

Fields	Element
`c_h`, `e_c_h`	Carbon
`n_h`, `e_n_h`	Nitrogen
`o_h`, `e_o_h`	Oxygen
`na_h`, `e_na_h`	Sodium
`mg_h`, `e_mg_h`	Magnesium
`al_h`, `e_al_h`	Aluminum
`si_h`, `e_si_h`	Silicon
`p_h`, `e_p_h`	Phosphorus
`s_h`, `e_s_h`	Sulfur
`k_h`, `e_k_h`	Potassium
`ca_h`, `e_ca_h`	Calcium
`ti_h`, `e_ti_h`	Titanium
`v_h`, `e_v_h`	Vanadium
`cr_h`, `e_cr_h`	Chromium
`mn_h`, `e_mn_h`	Manganese
`fe_h`, `e_fe_h`	Iron
`co_h`, `e_co_h`	Cobalt
`ni_h`, `e_ni_h`	Nickel
`cu_h`, `e_cu_h`	Copper
`ge_h`, `e_ge_h`	Germanium
`c12_c13`, `e_c12_c13`	Carbon isotope ratio (12C/13C)

Every e_* uncertainty also has a raw_e_* counterpart that stores the formal uncertainty before the noise model correction.

Fit quality and metadata¶

Field	Type	Description
`chi2`	float	Chi-squared of the best fit
`reduced_chi2`	float	Reduced chi-squared
`result_flags`	bitmask	Bitfield encoding quality flags

Correlation coefficients¶

Pairwise correlation coefficients between all labels are stored as rho_<label1>_<label2> fields (e.g., rho_teff_logg, rho_teff_fe_h). These are derived from the covariance matrix of the fit.
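
A correlation coefficient is the covariance of two labels divided by the product of their standard deviations. The sketch below shows one way to turn a covariance matrix into rho_* fields; the function name and the choice to store each pair once, in label order, are assumptions for illustration, and the covariance values are made up.

```python
import numpy as np


def correlation_fields(cov, label_names):
    """Map a covariance matrix to {"rho_<label1>_<label2>": coefficient}."""
    sigma = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sigma, sigma)
    fields = {}
    for i in range(len(label_names)):
        for j in range(i + 1, len(label_names)):
            fields[f"rho_{label_names[i]}_{label_names[j]}"] = float(corr[i, j])
    return fields


# Illustrative covariance for (teff, logg, fe_h); standard deviations 50, 0.1, 0.05.
cov = np.array([
    [2500.0, 2.0, 0.5],
    [2.0, 0.01, 0.001],
    [0.5, 0.001, 0.0025],
])
rho = correlation_fields(cov, ["teff", "logg", "fe_h"])
```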

Spectral data¶

Field	Type	Description
`wavelength`	array	Wavelength array (log-lambda spaced, 8575 pixels)
`model_flux`	array	Best-fit model flux (continuum × rectified model)
`continuum`	array	Fitted continuum

These spectral data arrays are stored in intermediate pickle files and loaded on demand.

Flags¶

Flag	Bit	Description
`flag_fitting_failure`	2^0	The fitting procedure failed
`flag_warn_teff`	2^1	Teff < 3,100 K or Teff > 7,900 K
`flag_warn_logg`	2^2	log g < 0.1 or log g > 5.2
`flag_warn_fe_h`	2^3	[Fe/H] < -1.4 or [Fe/H] > 0.4
`flag_low_snr`	2^4	S/N < 70

Summary flags¶

flag_warn: Set when any flag bit is non-zero (i.e., result_flags > 0).
flag_bad: Set only when flag_fitting_failure is set.
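
The bit layout and summary flags above can be reproduced with plain integer bit operations. The helper functions below are hypothetical and only mirror the thresholds and bit positions listed in the table; they are not the pipeline's own API.

```python
FLAG_BITS = {
    "flag_fitting_failure": 2**0,
    "flag_warn_teff": 2**1,
    "flag_warn_logg": 2**2,
    "flag_warn_fe_h": 2**3,
    "flag_low_snr": 2**4,
}


def compute_result_flags(teff, logg, fe_h, snr, fit_failed=False):
    """Build the bitfield from the thresholds listed in the flags table."""
    flags = 0
    if fit_failed:
        flags |= FLAG_BITS["flag_fitting_failure"]
    if teff < 3100 or teff > 7900:
        flags |= FLAG_BITS["flag_warn_teff"]
    if logg < 0.1 or logg > 5.2:
        flags |= FLAG_BITS["flag_warn_logg"]
    if fe_h < -1.4 or fe_h > 0.4:
        flags |= FLAG_BITS["flag_warn_fe_h"]
    if snr < 70:
        flags |= FLAG_BITS["flag_low_snr"]
    return flags


def decode_flags(result_flags):
    """List the names of all flags set in result_flags."""
    return [name for name, bit in FLAG_BITS.items() if result_flags & bit]


def flag_warn(result_flags):
    """True when any flag bit is set."""
    return result_flags > 0


def flag_bad(result_flags):
    """True only when the fitting-failure bit is set."""
    return bool(result_flags & FLAG_BITS["flag_fitting_failure"])


# A hot star with low signal-to-noise: bits 2^1 and 2^4 are set.
result_flags = compute_result_flags(teff=8200.0, logg=4.0, fe_h=0.0, snr=50.0)
flag_names = decode_flags(result_flags)
```

For this example `result_flags` is 18, so `flag_warn` is set while `flag_bad` is not.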

Caveats¶

The Payne is a forward-modeling approach: it uses a neural network to emulate spectra and optimizes labels to fit the data. This is fundamentally different from pipelines like AstroNN or APOGEENet, which directly predict labels from spectra.
The effective training range is approximately 3,100 – 7,900 K in Teff, 0.1 – 5.2 in log g, and -1.4 to +0.4 in [Fe/H]. Results outside these ranges are flagged.
The optimization uses bounded parameters (labels are internally constrained to [0, 1] in normalized space). As a result, the optimizer cannot return labels outside the range spanned by x_min and x_max, so it cannot extrapolate far beyond the training grid.
Continuum normalization uses a 4th-order Chebyshev polynomial fit with a fixed pixel mask. Poor continuum fits (e.g., for emission-line stars or heavily reddened spectra) can propagate into biased label estimates.
The model and continuum spectral arrays are stored as intermediate pickle files, not in the database directly. These files are loaded lazily when the model_flux or continuum attributes are accessed.
The formal uncertainties from the covariance matrix tend to underestimate true uncertainties, which is why a post-hoc noise model correction is applied.