# The Payne

The Payne is a neural network-based spectral emulator that estimates stellar labels by fitting a forward model to observed APOGEE spectra. Unlike pipelines that directly predict labels from spectra, The Payne uses a trained neural network as a fast spectral emulator and optimizes labels to best match the observed spectrum.

## What it does

The Payne estimates a comprehensive set of stellar labels from APOGEE spectra:

- **Stellar parameters**: Teff, log g, microturbulence (v_turb), macroturbulence (v_macro)
- **Chemical abundances**: [C/H], [N/H], [O/H], [Na/H], [Mg/H], [Al/H], [Si/H], [P/H], [S/H], [K/H], [Ca/H], [Ti/H], [V/H], [Cr/H], [Mn/H], [Fe/H], [Co/H], [Ni/H], [Cu/H], [Ge/H]
- **Isotope ratio**: 12C/13C
- **Radial velocity**: v_rel (if enabled via `v_rad_tolerance`)
- **Fit quality**: chi-squared and reduced chi-squared

## How it works

### Neural network emulator

The Payne uses a simple feedforward neural network with two hidden layers and leaky ReLU activations to emulate stellar spectra. Given a set of stellar labels, the network rapidly predicts the corresponding spectrum:

```
input labels -> Dense + Leaky ReLU -> Dense + Leaky ReLU -> Dense -> predicted spectrum
```

The network weights and biases are loaded from a pre-trained model (`payne_apogee_nn.pkl`). The labels are internally scaled to a [0, 1] range using pre-computed `x_min` and `x_max` values.

### Fitting procedure

For each observed spectrum, The Payne:

1. **Continuum-normalizes** the observed spectrum using a Chebyshev polynomial (degree 4 by default) fit across three APOGEE detector regions:
   - 15,100 -- 15,793 Angstroms
   - 15,880 -- 16,417 Angstroms
   - 16,499 -- 17,000 Angstroms
2. **Interpolates** the observed spectrum onto the model wavelength grid.
3. **Optimizes** the stellar labels using `scipy.optimize.curve_fit` with the Trust Region Reflective (TRF) method, minimizing the chi-squared difference between the model and observed spectrum.
4. **Estimates uncertainties** from the covariance matrix of the fit. Correlation coefficients between all pairs of labels are also computed and stored.

### Radial velocity

Radial velocity fitting is controlled by the `v_rad_tolerance` parameter (default: 0, meaning disabled). When enabled, an additional label representing the radial velocity is optimized simultaneously with the stellar labels, and the model spectrum is Doppler-shifted accordingly.

### Pixel masking

A mask (`payne_apogee_mask.npy`) can be applied to exclude certain spectral pixels from the fit. Masked pixels have their inverse variance set to zero.

### Noise model

A post-hoc noise model is applied to adjust formal uncertainties. The raw (formal) uncertainties from the fit covariance matrix are stored as `raw_e_*` fields, while the corrected uncertainties (using empirical calibration from `ThePayne_corrections.pkl`) are stored as `e_*` fields.

## Output fields

### Stellar parameters

| Field | Type | Description |
|-------|------|-------------|
| `teff` | float | Effective temperature (K) |
| `e_teff` | float | Uncertainty in Teff |
| `logg` | float | Surface gravity (log10(cm/s^2)) |
| `e_logg` | float | Uncertainty in log g |
| `v_turb` | float | Microturbulent velocity (km/s) |
| `e_v_turb` | float | Uncertainty in v_turb |
| `v_macro` | float | Macroturbulent velocity (km/s) |
| `e_v_macro` | float | Uncertainty in v_macro |
| `v_rel` | float | Relative radial velocity (km/s) |

### Chemical abundances

Each abundance is reported as [X/H] in dex:

| Fields | Element |
|--------|---------|
| `c_h`, `e_c_h` | Carbon |
| `n_h`, `e_n_h` | Nitrogen |
| `o_h`, `e_o_h` | Oxygen |
| `na_h`, `e_na_h` | Sodium |
| `mg_h`, `e_mg_h` | Magnesium |
| `al_h`, `e_al_h` | Aluminum |
| `si_h`, `e_si_h` | Silicon |
| `p_h`, `e_p_h` | Phosphorus |
| `s_h`, `e_s_h` | Sulfur |
| `k_h`, `e_k_h` | Potassium |
| `ca_h`, `e_ca_h` | Calcium |
| `ti_h`, `e_ti_h` | Titanium |
| `v_h`, `e_v_h` | Vanadium |
| `cr_h`, `e_cr_h` | Chromium |
| `mn_h`, `e_mn_h` | Manganese |
| `fe_h`, `e_fe_h` | Iron |
| `co_h`, `e_co_h` | Cobalt |
| `ni_h`, `e_ni_h` | Nickel |
| `cu_h`, `e_cu_h` | Copper |
| `ge_h`, `e_ge_h` | Germanium |
| `c12_c13`, `e_c12_c13` | Carbon isotope ratio (12C/13C) |

All labels also have `raw_e_*` counterparts storing the formal uncertainties before noise model correction.

### Fit quality and metadata

| Field | Type | Description |
|-------|------|-------------|
| `chi2` | float | Chi-squared of the best fit |
| `reduced_chi2` | float | Reduced chi-squared |
| `result_flags` | bitmask | Bitfield encoding quality flags |

### Correlation coefficients

Pairwise correlation coefficients between all labels are stored as `rho_<label1>_<label2>` fields (e.g., `rho_teff_logg`, `rho_teff_fe_h`). These are derived from the covariance matrix of the fit.

### Spectral data

| Field | Type | Description |
|-------|------|-------------|
| `wavelength` | array | Wavelength array (log-lambda spaced, 8575 pixels) |
| `model_flux` | array | Best-fit model flux (continuum x rectified model) |
| `continuum` | array | Fitted continuum |

These spectral data arrays are stored in intermediate pickle files and loaded on demand.

## Flags

| Flag | Bit | Description |
|------|-----|-------------|
| `flag_fitting_failure` | 2^0 | The fitting procedure failed |
| `flag_warn_teff` | 2^1 | Teff < 3,100 K or Teff > 7,900 K |
| `flag_warn_logg` | 2^2 | log g < 0.1 or log g > 5.2 |
| `flag_warn_fe_h` | 2^3 | [Fe/H] > 0.4 or [Fe/H] < -1.4 |
| `flag_low_snr` | 2^4 | S/N < 70 |

### Summary flags

- **`flag_warn`**: Set when any flag bit is non-zero (i.e., `result_flags > 0`).
- **`flag_bad`**: Set only when `flag_fitting_failure` is set.

## Caveats

- The Payne is a forward-modeling approach: it uses a neural network to emulate spectra and optimizes labels to fit the data. This is fundamentally different from pipelines like AstroNN or APOGEENet, which directly predict labels from spectra.
- The effective training range is approximately 3,100 -- 7,900 K in Teff, 0.1 -- 5.2 in log g, and -1.4 to +0.4 in [Fe/H]. Results outside these ranges are flagged.
- The optimization uses bounded parameters (labels are internally constrained to [0, 1] in normalized space). This means the optimizer cannot extrapolate far beyond the training grid.
- Continuum normalization uses a 4th-order Chebyshev polynomial fit with a fixed pixel mask. Poor continuum fits (e.g., for emission-line stars or heavily reddened spectra) can propagate into biased label estimates.
- The model and continuum spectral arrays are stored as intermediate pickle files, not in the database directly. These files are loaded lazily when the `model_flux` or `continuum` attributes are accessed.
- The formal uncertainties from the covariance matrix tend to underestimate true uncertainties, which is why a post-hoc noise model correction is applied.
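
To make the emulator and fitting procedure from "How it works" more concrete, here is a minimal sketch of the idea. It does not reproduce the pipeline's code. The network uses random toy weights rather than the contents of `payne_apogee_nn.pkl`, only three labels are fitted, and the leaky ReLU slope, the layer sizes, the `x_min`/`x_max` values, and the names `emulate`, `scale`, `unscale`, and `fit_labels` are all assumptions made for illustration. Continuum normalization, interpolation, radial velocity, and the noise model are left out.

```python
import numpy as np
from scipy.optimize import curve_fit

# Toy stand-in for the pre-trained emulator (assumption: random weights,
# 3 labels, 200 pixels; the real model is loaded from payne_apogee_nn.pkl).
rng = np.random.default_rng(0)
n_labels, n_hidden, n_pixels = 3, 16, 200
w0, b0 = rng.normal(size=(n_hidden, n_labels)), rng.normal(size=n_hidden)
w1, b1 = rng.normal(size=(n_hidden, n_hidden)) / 4.0, rng.normal(size=n_hidden)
w2, b2 = rng.normal(size=(n_pixels, n_hidden)) / 4.0, rng.normal(size=n_pixels)

# Illustrative label ranges for Teff, log g, [Fe/H] (assumed values).
x_min = np.array([3100.0, 0.1, -1.4])
x_max = np.array([7900.0, 5.2, 0.4])


def scale(labels):
    """Map physical labels to the [0, 1] range."""
    return (np.asarray(labels) - x_min) / (x_max - x_min)


def unscale(scaled):
    """Map [0, 1] labels back to physical units."""
    return np.asarray(scaled) * (x_max - x_min) + x_min


def leaky_relu(z, slope=0.01):  # slope is an assumed value
    return np.where(z > 0, z, slope * z)


def emulate(scaled_labels):
    """labels -> Dense + Leaky ReLU -> Dense + Leaky ReLU -> Dense -> spectrum."""
    h = leaky_relu(w0 @ scaled_labels + b0)
    h = leaky_relu(w1 @ h + b1)
    return w2 @ h + b2


def fit_labels(flux, ivar):
    """Fit scaled labels with bounded TRF least squares; return a result dict."""
    good = ivar > 0
    # Masked pixels (ivar == 0) get a huge sigma so they carry no weight.
    sigma = np.full(flux.shape, 1e10)
    sigma[good] = 1.0 / np.sqrt(ivar[good])

    def model(_pixels, *scaled):
        return emulate(np.asarray(scaled))

    p0 = np.full(n_labels, 0.5)  # start at the centre of the scaled range
    popt, pcov = curve_fit(
        model, np.arange(flux.size), flux, p0=p0, sigma=sigma,
        absolute_sigma=True, bounds=(0.0, 1.0), method="trf",
    )
    std_scaled = np.sqrt(np.diag(pcov))
    chi2 = float(np.sum(ivar * (flux - emulate(popt)) ** 2))
    return {
        "labels": unscale(popt),
        "e_labels": std_scaled * (x_max - x_min),   # formal uncertainties
        "rho": pcov / np.outer(std_scaled, std_scaled),  # correlation matrix
        "chi2": chi2,
        "reduced_chi2": chi2 / (good.sum() - n_labels),
    }


# Usage: simulate a spectrum from known labels, mask a few pixels, and refit.
true_labels = np.array([5000.0, 2.5, -0.5])
obs_ivar = np.full(n_pixels, 1e4)
obs_ivar[:10] = 0.0  # masked pixels
obs_flux = emulate(scale(true_labels)) + rng.normal(size=n_pixels) * 0.01
result = fit_labels(obs_flux, obs_ivar)
print(result["labels"], result["reduced_chi2"])
```

Because the labels are fitted in scaled space with bounds of 0 and 1, the returned labels can never leave the `x_min` to `x_max` range, which is the behavior described in the caveat about bounded parameters.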
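
The bit layout in the "Flags" table can be illustrated with a short sketch. The bit positions and thresholds are taken from that table; the function names `compute_result_flags` and `decode_result_flags` are hypothetical, and how the pipeline itself sets these bits is not shown in this document.

```python
# Bit positions from the Flags table (bit n has value 2^n).
FLAG_BITS = {
    "flag_fitting_failure": 0,
    "flag_warn_teff": 1,
    "flag_warn_logg": 2,
    "flag_warn_fe_h": 3,
    "flag_low_snr": 4,
}


def compute_result_flags(teff, logg, fe_h, snr, fitting_failure=False):
    """Build a result_flags integer from the thresholds in the Flags table."""
    conditions = {
        "flag_fitting_failure": fitting_failure,
        "flag_warn_teff": teff < 3100 or teff > 7900,
        "flag_warn_logg": logg < 0.1 or logg > 5.2,
        "flag_warn_fe_h": fe_h > 0.4 or fe_h < -1.4,
        "flag_low_snr": snr < 70,
    }
    flags = 0
    for name, is_set in conditions.items():
        if is_set:
            flags |= 1 << FLAG_BITS[name]
    return flags


def decode_result_flags(result_flags):
    """Expand result_flags into named booleans plus the two summary flags."""
    decoded = {
        name: bool(result_flags & (1 << bit)) for name, bit in FLAG_BITS.items()
    }
    decoded["flag_warn"] = result_flags > 0
    decoded["flag_bad"] = decoded["flag_fitting_failure"]
    return decoded


# Usage: a hot star with low S/N sets the Teff and S/N warning bits.
flags = compute_result_flags(teff=8000.0, logg=2.5, fe_h=0.0, snr=50.0)
print(flags, decode_result_flags(flags))
```

When selecting samples, filtering on `flag_bad` removes only failed fits, while filtering on `flag_warn` also removes results near the edge of the training range and low S/N spectra.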