astra.contrib.thecannon

Package Contents

Classes

CannonModel – A model for The Cannon which includes L1 regularization and pixel censoring.
class astra.contrib.thecannon.CannonModel(training_set_labels, training_set_flux, training_set_ivar, vectorizer, dispersion=None, regularization=None, censors=None, **kwargs)

A model for The Cannon which includes L1 regularization and pixel censoring.

Parameters:
  • training_set_labels – A set of objects with labels known to high fidelity. This can be given as a NumPy structured array or an Astropy table.
  • training_set_flux – An array of normalized fluxes for stars in the labelled set, given as shape (num_stars, num_pixels). Here num_stars should match the number of rows in training_set_labels.
  • training_set_ivar – An array of inverse variances on the normalized fluxes for stars in the training set. The shape of the training_set_ivar array should match that of training_set_flux.
  • vectorizer – A vectorizer to take input labels and produce a design matrix. This should be a sub-class of vectorizer.BaseVectorizer.
  • dispersion – [optional] The dispersion values corresponding to the given pixels. If provided, this should have a size of num_pixels.
  • regularization – [optional] The strength of the L1 regularization. This should either be None, a float-type value for single regularization strength for all pixels, or a float-like array of length num_pixels.
  • censors – [optional] A dictionary containing label names as keys and boolean censoring masks as values.
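
The shape requirements listed above can be checked before a model is constructed. The following is a minimal sketch using NumPy only: check_training_shapes is a hypothetical helper (not part of astra), and the label names and synthetic data are made up for illustration.

```python
import numpy as np


def check_training_shapes(labels, flux, ivar, dispersion=None, regularization=None):
    """Hypothetical helper: validate input shapes as described in the parameter list."""
    flux = np.atleast_2d(flux)
    ivar = np.atleast_2d(ivar)
    num_stars, num_pixels = flux.shape
    if len(labels) != num_stars:
        raise ValueError("training_set_labels must have one row per star")
    if ivar.shape != flux.shape:
        raise ValueError("training_set_ivar must match the shape of training_set_flux")
    if dispersion is not None and np.size(dispersion) != num_pixels:
        raise ValueError("dispersion must have num_pixels entries")
    if regularization is not None and np.size(regularization) not in (1, num_pixels):
        raise ValueError("regularization must be a scalar or have num_pixels entries")
    return num_stars, num_pixels


# Synthetic training set: 5 stars, 8 pixels, two (assumed) label names.
rng = np.random.default_rng(0)
num_stars, num_pixels = 5, 8
labels = np.zeros(num_stars, dtype=[("teff", float), ("logg", float)])
labels["teff"] = rng.uniform(4000, 6000, num_stars)
labels["logg"] = rng.uniform(1, 5, num_stars)
flux = 1.0 + 0.01 * rng.standard_normal((num_stars, num_pixels))
ivar = np.full_like(flux, 1e4)
dispersion = np.linspace(15000.0, 15010.0, num_pixels)

shape = check_training_shapes(labels, flux, ivar, dispersion, regularization=10.0)
```

Arrays that pass these checks have the shapes the constructor documents; whether the constructor performs exactly these checks is not stated here.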
_data_attributes = ['training_set_labels', 'training_set_flux', 'training_set_ivar']
_descriptive_attributes = ['vectorizer', 'censors', 'regularization', 'dispersion']
_trained_attributes = ['theta', 's2']
training_set_labels

Return the labels in the training set.

training_set_flux

Return the training set fluxes.

training_set_ivar

Return the inverse variances of the training set fluxes.

vectorizer

Return the vectorizer for this model.

design_matrix

Return the design matrix for this model.

theta

Return the theta coefficients (spectral model derivatives).

s2

Return the intrinsic variance (s^2) for all pixels.

censors

Return the wavelength censor masks for the labels.

dispersion

Return the dispersion points for all pixels.

regularization

Return the strength of the L1 regularization for this model.

is_trained

Return True if the model has been trained, otherwise False.

__str__(self)

Return str(self).

__repr__(self)

Return repr(self).

_censored_design_matrix(self, pixel_index, fill_value=np.nan)

Return a censored design matrix for the given pixel index, and a mask of which theta values to ignore when fitting.

Parameters: pixel_index – The zero-indexed pixel.
Returns: A two-length tuple containing the censored design matrix for this pixel, and a boolean mask of the theta values to exclude when fitting for the spectral derivatives.
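
As an illustration of the idea, the sketch below censors the columns of a small design matrix for one pixel. It is not the library's implementation: the function name, the term_labels bookkeeping (which labels each design-matrix column depends on), and the example labels are all assumptions made for this example.

```python
import numpy as np


def censored_design_matrix(design_matrix, term_labels, censors, pixel_index,
                           fill_value=np.nan):
    """
    Illustrative only. design_matrix has shape (num_stars, num_terms),
    term_labels[j] lists the label names used by column j (empty for the
    constant column), and censors maps label name -> boolean mask over pixels.
    """
    # Labels that are censored at this pixel.
    censored = {name for name, mask in censors.items() if mask[pixel_index]}
    # A term is excluded if it depends on any censored label.
    theta_mask = np.array([bool(censored & set(terms)) for terms in term_labels])
    matrix = np.array(design_matrix, dtype=float)  # work on a copy
    matrix[:, theta_mask] = fill_value
    return matrix, theta_mask


design = np.array([[1.0, 0.1, 0.2, 0.02],
                   [1.0, -0.3, 0.4, -0.12]])
term_labels = [(), ("teff",), ("logg",), ("teff", "logg")]
censors = {"logg": np.array([False, True, False])}

matrix, mask = censored_design_matrix(design, term_labels, censors, pixel_index=1)
```

At pixel 1 the logg label is censored, so both the logg column and the teff-logg cross-term column are filled and flagged for exclusion.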
reset(self)

Clear any attributes that have been trained.

_pixel_access(self, array, index, default=None)

Safely access a (potentially per-pixel) attribute of the model.

Parameters:
  • array – Either None, a float value, or an array the size of the dispersion array.
  • index – The zero-indexed pixel to attempt to access.
  • default – [optional] The default value to return if array is None.
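
A minimal sketch of this access pattern, written as a standalone function under the assumption that scalars apply to every pixel (the actual method may differ in detail):

```python
import numpy as np


def pixel_access(array, index, default=None):
    """Return the per-pixel value of `array`, which may be None, a scalar, or an array."""
    if array is None:
        return default
    try:
        return array[index]
    except (IndexError, TypeError):
        # Scalars cannot be indexed: the same value applies to every pixel.
        return array


from_none = pixel_access(None, 3, default=0.0)
from_scalar = pixel_access(10.0, 3)
from_array = pixel_access(np.array([1.0, 2.0, 3.0]), 1)
```

This is how a single regularization strength and a per-pixel regularization array can be handled by the same code path.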
_verify_training_data(self, rho_warning=0.9)

Verify the training data for the appropriate shape and content.

Parameters: rho_warning – [optional] Maximum correlation value between labels before a warning is given.
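
The correlation check could look something like the following sketch. The helper name, the use of the absolute Pearson correlation, and the example labels are assumptions for illustration; only the rho_warning threshold comes from the documentation above.

```python
import warnings

import numpy as np


def find_correlated_labels(label_array, label_names, rho_warning=0.9):
    """Hypothetical helper: warn about label pairs whose |correlation| exceeds rho_warning."""
    rho = np.corrcoef(label_array, rowvar=False)  # columns are labels
    flagged = []
    for i in range(len(label_names)):
        for j in range(i + 1, len(label_names)):
            if abs(rho[i, j]) > rho_warning:
                flagged.append((label_names[i], label_names[j], float(rho[i, j])))
                warnings.warn(
                    f"Labels {label_names[i]!r} and {label_names[j]!r} are highly "
                    f"correlated (rho = {rho[i, j]:.3f})"
                )
    return flagged


names = ["teff", "logg", "feh"]
label_array = np.array([
    [4000.0, 4500.0, 5000.0, 5500.0, 6000.0],  # teff
    [1.0, 2.0, 3.0, 4.0, 5.1],                 # logg, nearly linear in teff
    [0.1, -0.3, 0.2, -0.1, 0.0],               # feh, uncorrelated with the others
]).T

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = find_correlated_labels(label_array, names)
```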
in_convex_hull(self, labels)

Return whether the provided labels are inside a convex hull constructed from the labelled set.

Parameters: labels – An NxK array of N sets of K labels, where K is the number of labels that make up the vectorizer.
Returns: A boolean array indicating whether each point is inside the convex hull of the labelled set.
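
One common way to test convex-hull membership is through a Delaunay triangulation of the training labels, sketched below with scipy.spatial.Delaunay. That this method uses the same approach is an assumption; the label values are made up.

```python
import numpy as np
from scipy.spatial import Delaunay

# Training labels: N = 5 points in K = 2 label dimensions (synthetic values).
training_labels = np.array([
    [0.0, 0.0],
    [4.0, 0.0],
    [0.0, 3.0],
    [4.0, 4.0],
    [1.0, 1.0],
])

# Points to test, given as an NxK array.
query = np.array([
    [1.0, 2.0],   # inside the hull
    [5.0, 1.0],   # outside (beyond the right edge)
    [1.0, 3.5],   # outside (above the top edge)
])

# find_simplex returns -1 for points that fall outside the triangulation.
hull = Delaunay(training_labels)
inside = hull.find_simplex(query) >= 0
```

Labels outside the hull require the model to extrapolate beyond the training set, so results for those points deserve extra caution.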
write(self, path, include_training_set_spectra=False, overwrite=False, protocol=-1)

Serialise the trained model and save it to disk. This will save all relevant training attributes, and optionally, the training data.

Parameters:
  • path – The path to save the model to.
  • include_training_set_spectra – [optional] Save the labelled set, normalised flux and inverse variance used to train the model.
  • overwrite – [optional] Overwrite the existing file path, if it already exists.
  • protocol – [optional] The Python pickling protocol to employ. Use 2 for compatibility with previous Python releases, -1 for performance.
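
The overwrite and protocol options behave much like a guarded pickle.dump call. The sketch below uses only the standard library; write_state and read_state are hypothetical stand-ins, and the dictionary layout of the saved state is an assumption rather than the actual file format.

```python
import os
import pickle
import tempfile


def write_state(path, state, overwrite=False, protocol=-1):
    """Hypothetical stand-in for write(): pickle a state dictionary to disk."""
    if os.path.exists(path) and not overwrite:
        raise IOError(f"output path already exists: {path}")
    with open(path, "wb") as fp:
        pickle.dump(state, fp, protocol)  # -1 selects the highest protocol


def read_state(path):
    """Hypothetical stand-in for read(): load a pickled state dictionary."""
    with open(path, "rb") as fp:
        return pickle.load(fp)


path = os.path.join(tempfile.mkdtemp(), "model.pkl")
state = {"theta": [[1.0, 0.0]], "s2": [0.0], "regularization": None}

write_state(path, state)
restored = read_state(path)

# A second write to the same path is refused unless overwrite=True.
try:
    write_state(path, state)
    refused = False
except IOError:
    refused = True

write_state(path, state, overwrite=True)
```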
classmethod read(cls, path, **kwargs)

Read a saved model from disk.

Parameters: path – The path to load the model from.
train(self, threads=None, op_method=None, op_strict=True, op_kwds=None, **kwargs)

Train the model.

Parameters:
  • threads – [optional] The number of parallel threads to use.
  • op_method – [optional] The optimization algorithm to use. Available options are l_bfgs_b (default) and powell.
  • op_strict – [optional] Default to Powell’s optimization method if BFGS fails.
  • op_kwds – [optional] Keyword arguments to provide directly to the optimization function.
Returns:

A three-length tuple containing the spectral coefficients theta, the squared scatter term at each pixel s2, and metadata related to the training of each pixel.

__call__(self, labels)

Return spectral fluxes, given the labels.

Parameters: labels – An array of stellar labels.
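
Conceptually, the predicted flux at each pixel is the product of the design vector built from the labels and that pixel's theta coefficients. The sketch below assumes a purely linear vectorizer with terms [1, label_1, label_2] and no label scaling; the real vectorizer may use higher-order terms, and predict_flux is a hypothetical name.

```python
import numpy as np


def predict_flux(theta, labels):
    """Illustrative linear model: flux = [1, labels] . theta^T for every pixel."""
    labels = np.atleast_2d(labels)
    design = np.hstack([np.ones((labels.shape[0], 1)), labels])
    return design @ theta.T  # shape (num_spectra, num_pixels)


# theta has shape (num_pixels, num_terms); values are made up.
theta = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.0, -0.2],
])

flux = predict_flux(theta, [0.5, 1.0])
```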
test(self, flux, ivar, dispersion=None, initial_labels=None, initialisations=1, threads=None, use_derivatives=True, op_kwds=None, **kwargs)

Run the test step on spectra.

Parameters:
  • flux – The (pseudo-continuum-normalized) spectral flux.
  • ivar – The inverse variance values for the spectral fluxes.
  • dispersion – [optional] The dispersion values for the given flux and inverse variances. If given, then the model will be interpolated to these dispersion values at runtime.
  • initial_labels – [optional] The initial labels to try for each spectrum. This can be a single set of initial values, or one set of initial values for each star.
  • initialisations – [optional] The number of initial starting points to use, based on percentiles of the labels in the training set. This is ignored if initial_labels is given.
  • threads – [optional] The number of parallel threads to use.
  • use_derivatives – [optional] Use True to employ the analytic derivatives provided by the vectorizer, None to calculate derivatives on the fly, or a callable to supply your own derivatives.
  • op_kwds – [optional] Optimization keywords that get passed to scipy.optimize.leastsq.
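
The test step inverts the model: given flux and inverse variances, it finds the labels that best reproduce the spectrum. The documented optimizer is scipy.optimize.leastsq; the sketch below swaps in a closed-form weighted linear least-squares solve, which is only valid under the assumption of a linear vectorizer with terms [1, label_1, label_2]. It also ignores the intrinsic variance s2. fit_labels_linear is a hypothetical name.

```python
import numpy as np


def fit_labels_linear(theta, flux, ivar):
    """Weighted least-squares label estimate for an (assumed) linear model."""
    weights = np.sqrt(ivar)
    # Solve weights * (flux - theta_0) = weights * theta_{1:} . labels
    lhs = theta[:, 1:] * weights[:, None]
    rhs = (flux - theta[:, 0]) * weights
    labels, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return labels


theta = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.0, -0.2],
    [1.0, 0.05, 0.05],
])
true_labels = np.array([0.5, 1.0])

# Noise-free synthetic spectrum generated from the same linear model.
flux = theta[:, 0] + theta[:, 1:] @ true_labels
ivar = np.full(flux.shape, 1e4)

recovered = fit_labels_linear(theta, flux, ivar)
```

With a non-linear vectorizer no closed form exists, which is why an iterative optimizer and initial label guesses are needed.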
_initial_theta(self, pixel_index, **kwargs)

Return a list of guesses of the spectral coefficients for the given pixel index. Initial values are sourced in the following preference order:

  1. a previously trained theta value for this pixel,
  2. an estimate of theta using linear algebra,
  3. a neighbouring pixel’s theta value,
  4. the fiducial value of [1, 0, …, 0].
Parameters: pixel_index – The zero-indexed integer of the pixel.
Returns: A list of initial theta guesses, and the source of each guess.
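
The preference order above can be sketched as follows. This is an illustration rather than the library's code: the function name, the source strings, the choice of an unweighted least-squares estimate for step 2, and the use of NaN to mark untrained pixels are all assumptions.

```python
import numpy as np


def initial_theta_guesses(pixel_index, design_matrix, flux, theta=None):
    """Return (guess, source) pairs in the documented preference order."""
    num_terms = design_matrix.shape[1]
    guesses = []

    # 1. A previously trained theta value for this pixel.
    if theta is not None and np.all(np.isfinite(theta[pixel_index])):
        guesses.append((theta[pixel_index], "previously_trained"))

    # 2. An estimate from linear algebra (here: unweighted least squares).
    estimate, *_ = np.linalg.lstsq(design_matrix, flux[:, pixel_index], rcond=None)
    guesses.append((estimate, "linear_algebra"))

    # 3. A neighbouring pixel's theta value, if one has been trained.
    if theta is not None:
        for neighbour in (pixel_index - 1, pixel_index + 1):
            if 0 <= neighbour < len(theta) and np.all(np.isfinite(theta[neighbour])):
                guesses.append((theta[neighbour], "neighbour"))
                break

    # 4. The fiducial value [1, 0, ..., 0].
    fiducial = np.zeros(num_terms)
    fiducial[0] = 1.0
    guesses.append((fiducial, "fiducial"))
    return guesses


x = np.array([-1.0, 0.0, 1.0, 2.0])
design_matrix = np.column_stack([np.ones_like(x), x])
flux = np.column_stack([np.ones(4), 0.9 + 0.1 * x, 0.8 - 0.2 * x])

# Pixel 1 has not been trained yet (NaN), while its neighbours have.
theta = np.array([[1.0, 0.0], [np.nan, np.nan], [0.8, -0.2]])

guesses = initial_theta_guesses(1, design_matrix, flux, theta)
sources = [source for _, source in guesses]
```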