`astra.contrib.thecannon`¶

Package Contents¶

Classes¶

CannonModel A model for The Cannon which includes L1 regularization and pixel censoring.

class astra.contrib.thecannon.CannonModel(training_set_labels, training_set_flux, training_set_ivar, vectorizer, dispersion=None, regularization=None, censors=None, **kwargs)¶

A model for The Cannon which includes L1 regularization and pixel censoring.

Parameters:

training_set_labels – A set of objects with labels known to high fidelity. This can be given as a numpy structured array, or an astropy table.
training_set_flux – An array of normalised fluxes for stars in the labelled set, given as shape `(num_stars, num_pixels)`. The `num_stars` should match the number of rows in `training_set_labels`.
training_set_ivar – An array of inverse variances on the normalized fluxes for stars in the training set. The shape of the `training_set_ivar` array should match that of `training_set_flux`.
vectorizer – A vectorizer to take input labels and produce a design matrix. This should be a sub-class of `vectorizer.BaseVectorizer`.
dispersion – [optional] The dispersion values corresponding to the given pixels. If provided, this should have a size of `num_pixels`.
regularization – [optional] The strength of the L1 regularization. This should either be `None`, a float-type value for single regularization strength for all pixels, or a float-like array of length `num_pixels`.
censors – [optional] A dictionary containing label names as keys and boolean censoring masks as values.
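As a rough illustration of the input shapes described above, the following sketch builds a small synthetic training set with numpy. The label names, the wavelength values, and the convention that each censoring mask has `num_pixels` entries with `True` marking a censored pixel are assumptions made for the example, not something this page states.

```python
import numpy as np

rng = np.random.default_rng(0)
num_stars, num_pixels = 5, 8

# Labels as a numpy structured array: one row per star.
# The label names here are illustrative, not required by the package.
training_set_labels = np.zeros(
    num_stars, dtype=[("TEFF", float), ("LOGG", float), ("FE_H", float)]
)
training_set_labels["TEFF"] = rng.uniform(4000.0, 6000.0, num_stars)
training_set_labels["LOGG"] = rng.uniform(1.0, 4.5, num_stars)
training_set_labels["FE_H"] = rng.uniform(-2.0, 0.5, num_stars)

# Synthetic normalized fluxes and inverse variances with matching shapes.
training_set_flux = 1.0 + 0.01 * rng.standard_normal((num_stars, num_pixels))
training_set_ivar = np.full_like(training_set_flux, 1.0e4)

# One dispersion value per pixel.
dispersion = np.linspace(15100.0, 15107.0, num_pixels)

# One regularization strength per pixel (a single float is also allowed).
regularization = np.full(num_pixels, 10.0)

# Assumption: each mask has num_pixels entries and True marks a censored pixel.
censors = {"FE_H": dispersion < 15103.0}

# Shape requirements taken from the parameter descriptions.
assert training_set_flux.shape == (num_stars, num_pixels)
assert training_set_ivar.shape == training_set_flux.shape
assert len(training_set_labels) == num_stars
assert dispersion.size == num_pixels

# With a vectorizer instance in hand, the constructor call would look like:
# model = CannonModel(training_set_labels, training_set_flux, training_set_ivar,
#                     vectorizer, dispersion=dispersion,
#                     regularization=regularization, censors=censors)
```

The constructor call is left as a comment because building a vectorizer is outside the scope of this page.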

_data_attributes = ['training_set_labels', 'training_set_flux', 'training_set_ivar']¶

_descriptive_attributes = ['vectorizer', 'censors', 'regularization', 'dispersion']¶

_trained_attributes = ['theta', 's2']¶

training_set_labels¶: Return the labels in the training set.

training_set_flux¶: Return the training set fluxes.

training_set_ivar¶: Return the inverse variances of the training set fluxes.

vectorizer¶: Return the vectorizer for this model.

design_matrix¶: Return the design matrix for this model.

theta¶: Return the theta coefficients (spectral model derivatives).

s2¶: Return the intrinsic variance (s^2) for all pixels.

censors¶: Return the wavelength censor masks for the labels.

dispersion¶: Return the dispersion points for all pixels.

regularization¶: Return the strength of the L1 regularization for this model.

is_trained¶: Return true or false for whether the model is trained.

__str__(self)¶: Return str(self).

__repr__(self)¶: Return repr(self).

_censored_design_matrix(self, pixel_index, fill_value=np.nan)¶

Return a censored design matrix for the given pixel index, and a mask of which theta values to ignore when fitting.

Parameters:	pixel_index – The zero-indexed pixel.
Returns:	A two-length tuple containing the censored design matrix for this pixel, and a boolean mask of values to exclude when fitting for the spectral derivatives.

reset(self)¶: Clear any attributes that have been trained.

_pixel_access(self, array, index, default=None)¶

Safely access a (potentially per-pixel) attribute of the model.

Parameters:

array – Either `None`, a float value, or an array the size of the dispersion array.
index – The zero-indexed pixel to attempt to access.
default – [optional] The default value to return if `array` is None.
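The three cases described for `array` can be sketched in a few lines. This is a minimal stand-alone illustration of the documented behaviour, not the package's implementation; the function name `pixel_access` is hypothetical.

```python
from numbers import Real


def pixel_access(array, index, default=None):
    """Return the value that applies to pixel `index` (hypothetical helper)."""
    if array is None:
        # Nothing was set: fall back to the default.
        return default
    if isinstance(array, Real):
        # A single value applies to every pixel.
        return array
    # Otherwise treat it as a per-pixel sequence.
    return array[index]


no_value = pixel_access(None, 3, default=0.0)
shared_value = pixel_access(10.0, 3)
per_pixel_value = pixel_access([1.0, 2.0, 3.0], 1)
```

This pattern lets an attribute such as `regularization` be stored as `None`, a scalar, or a per-pixel array while the fitting code always asks for one value per pixel.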

_verify_training_data(self, rho_warning=0.9)¶

Verify the training data for the appropriate shape and content.

Parameters:	rho_warning – [optional] Maximum correlation value between labels before a warning is given.
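One way to picture the `rho_warning` check is as a scan over the pairwise correlation coefficients of the label columns. The sketch below assumes Pearson correlation and a "greater than or equal" comparison; both are assumptions, and `find_correlated_labels` is a hypothetical name.

```python
import warnings

import numpy as np


def find_correlated_labels(labels, label_names, rho_warning=0.9):
    """Warn about label pairs whose |correlation| reaches rho_warning.

    `labels` has shape (num_stars, num_labels).
    """
    rho = np.corrcoef(labels, rowvar=False)
    flagged = []
    for i in range(len(label_names)):
        for j in range(i + 1, len(label_names)):
            if abs(rho[i, j]) >= rho_warning:
                flagged.append((label_names[i], label_names[j], float(rho[i, j])))
                warnings.warn(
                    f"Labels {label_names[i]} and {label_names[j]} are highly "
                    f"correlated (rho = {rho[i, j]:.2f})"
                )
    return flagged


x = np.arange(10.0)
z = np.tile([1.0, -1.0], 5)  # alternating values, weakly correlated with x
labels = np.column_stack([x, 2.0 * x + 1.0, z])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = find_correlated_labels(labels, ("TEFF", "LOGG", "FE_H"))
```

Here the second column is a linear function of the first, so only that pair is flagged.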

in_convex_hull(self, labels)¶

Return whether the provided labels are inside a convex hull constructed from the labelled set.

Parameters:	labels – An `NxK` array of `N` sets of `K` labels, where `K` is the number of labels that make up the vectorizer.
Returns:	A boolean array indicating whether the points are in the convex hull of the labelled set.

write(self, path, include_training_set_spectra=False, overwrite=False, protocol=-1)¶

Serialise the trained model and save it to disk. This will save all relevant training attributes, and optionally, the training data.

Parameters:

path – The path to save the model to.
include_training_set_spectra – [optional] Save the labelled set, normalised flux and inverse variance used to train the model.
overwrite – [optional] Overwrite the existing file path, if it already exists.
protocol – [optional] The Python pickling protocol to employ. Use 2 for compatibility with previous Python releases, -1 for performance.
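The `overwrite` and `protocol` options map naturally onto the standard-library `pickle` module. The sketch below shows that pattern on a plain dictionary; `write_state`, `read_state`, and the layout of the saved dictionary are hypothetical and do not describe the file format that `write` actually produces.

```python
import os
import pickle
import tempfile


def write_state(path, state, overwrite=False, protocol=-1):
    """Pickle `state` to `path`, refusing to clobber an existing file."""
    if os.path.exists(path) and not overwrite:
        raise OSError(f"path {path} already exists")
    with open(path, "wb") as fp:
        # A negative protocol selects pickle.HIGHEST_PROTOCOL.
        pickle.dump(state, fp, protocol)


def read_state(path):
    """Load a previously pickled state from `path`."""
    with open(path, "rb") as fp:
        return pickle.load(fp)


state = {"theta": [[1.0, 0.0], [0.9, 0.1]], "s2": [0.0, 1e-4]}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "model.pkl")
    write_state(path, state, protocol=2)  # readable by older Python releases
    restored = read_state(path)
    try:
        write_state(path, state)  # overwrite=False, so this is refused
        refused = False
    except OSError:
        refused = True
```

Protocol 2 trades speed and file size for compatibility, while -1 always uses the newest protocol the running interpreter supports.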

classmethod read(cls, path, **kwargs)¶

Read a saved model from disk.

Parameters:	path – The path to load the model from.

train(self, threads=None, op_method=None, op_strict=True, op_kwds=None, **kwargs)¶

Train the model.

Parameters:

threads – [optional] The number of parallel threads to use.
op_method – [optional] The optimization algorithm to use: `l_bfgs_b` (default) and `powell` are available.
op_strict – [optional] Default to Powell’s optimization method if BFGS fails.
op_kwds – Keyword arguments to provide directly to the optimization function.
Returns:	A three-length tuple containing the spectral coefficients `theta`, the squared scatter term at each pixel `s2`, and metadata related to the training of each pixel.

__call__(self, labels)¶

Return spectral fluxes, given the labels.

Parameters:	labels – An array of stellar labels.
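Conceptually, predicting a spectrum amounts to multiplying the design matrix for the given labels by the per-pixel coefficients `theta`. The sketch below assumes a quadratic vectorizer (constant, linear, and second-order terms) and unscaled labels; both are assumptions for illustration, and the helper names are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def quadratic_design_matrix(labels):
    """Build [1, linear terms, second-order terms] for each set of labels."""
    labels = np.atleast_2d(labels).astype(float)
    num_labels = labels.shape[1]
    columns = [np.ones(len(labels))]
    columns.extend(labels[:, k] for k in range(num_labels))
    columns.extend(
        labels[:, i] * labels[:, j]
        for i, j in combinations_with_replacement(range(num_labels), 2)
    )
    return np.column_stack(columns)


def predict_flux(theta, labels):
    """Return fluxes of shape (num_spectra, num_pixels).

    `theta` has shape (num_pixels, num_terms).
    """
    return quadratic_design_matrix(labels) @ theta.T


# Three pixels, two labels: terms are [1, x0, x1, x0*x0, x0*x1, x1*x1].
theta = np.zeros((3, 6))
theta[:, 0] = 1.0    # continuum level of a normalized spectrum
theta[1, 1] = -0.1   # pixel 1 responds linearly to the first label
theta[2, 3] = 0.05   # pixel 2 responds to the square of the first label

flux = predict_flux(theta, [1.0, 2.0])
```

For the labels `[1.0, 2.0]` the three pixels evaluate to 1.0, 0.9, and 1.05 respectively.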

test(self, flux, ivar, dispersion=None, initial_labels=None, initialisations=1, threads=None, use_derivatives=True, op_kwds=None, **kwargs)¶

Run the test step on spectra.

Parameters:

flux – The (pseudo-continuum-normalized) spectral flux.
ivar – The inverse variance values for the spectral fluxes.
dispersion – [optional] The dispersion values for the given flux and inverse variances. If given, then the model will be interpolated to these dispersion values at runtime.
initial_labels – [optional] The initial labels to try for each spectrum. This can be a single set of initial values, or one set of initial values for each star.
initialisations – [optional] The number of initial starting points to use, based on percentiles of the labels in the training set. This is ignored if `initial_labels` is given.
threads – [optional] The number of parallel threads to use.
use_derivatives – [optional] Boolean `True` indicating to use analytic derivatives provided by the vectorizer, `None` to calculate on the fly, or a callable function to calculate your own derivatives.
op_kwds – [optional] Optimization keywords that get passed to `scipy.optimize.leastsq`.
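To make the `initialisations` option more concrete, the sketch below picks starting points from percentiles of the training labels. The evenly spaced percentiles used here are an assumption chosen for illustration; this page does not say which percentiles the package uses, and `percentile_initial_labels` is a hypothetical name.

```python
import numpy as np


def percentile_initial_labels(training_labels, initialisations=1):
    """Return an (initialisations, num_labels) array of starting points.

    `training_labels` has shape (num_stars, num_labels).
    """
    # Evenly spaced interior percentiles, e.g. 50 for one point,
    # or 25, 50, 75 for three points (an assumed spacing).
    percentiles = np.linspace(0.0, 100.0, initialisations + 2)[1:-1]
    return np.percentile(training_labels, percentiles, axis=0)


training_labels = np.column_stack([np.arange(11.0), 10.0 * np.arange(11.0)])

single_start = percentile_initial_labels(training_labels, 1)
three_starts = percentile_initial_labels(training_labels, 3)
```

Starting the optimizer from several points spread across the training set reduces the chance of reporting a poor local minimum, at the cost of one extra fit per starting point.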

_initial_theta(self, pixel_index, **kwargs)¶

Return a list of guesses of the spectral coefficients for the given pixel index. Initial values are sourced in the following preference order:

a previously trained theta value for this pixel,

an estimate of theta using linear algebra,

a neighbouring pixel’s theta value,

the fiducial value of [1, 0, …, 0].

Parameters:	pixel_index – The zero-indexed integer of the pixel.
Returns:	A list of initial theta guesses, and the source of each guess.
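The preference order above can be sketched as a function that collects guesses together with a tag naming their source. This is a minimal sketch under stated assumptions: the "linear algebra" estimate is taken to be an inverse-variance-weighted least-squares solve, and the function name, argument names, and source tags are all hypothetical.

```python
import numpy as np


def initial_theta_guesses(design_matrix, flux, ivar,
                          trained_theta=None, neighbour_theta=None):
    """Return a list of (theta_guess, source) tuples in preference order."""
    guesses = []

    # 1. A previously trained value for this pixel, if there is one.
    if trained_theta is not None:
        guesses.append((np.asarray(trained_theta, dtype=float), "trained_theta"))

    # 2. Weighted linear least squares: scale each row by sqrt(ivar).
    weights = np.sqrt(ivar)
    solution, *_ = np.linalg.lstsq(
        design_matrix * weights[:, None], flux * weights, rcond=None
    )
    guesses.append((solution, "linear_algebra"))

    # 3. The value from a neighbouring pixel, if available.
    if neighbour_theta is not None:
        guesses.append((np.asarray(neighbour_theta, dtype=float), "neighbour_pixel"))

    # 4. The fiducial value [1, 0, ..., 0]: a flat, normalized continuum.
    fiducial = np.zeros(design_matrix.shape[1])
    fiducial[0] = 1.0
    guesses.append((fiducial, "fiducial"))
    return guesses


# One label, linear model: flux = 1 + 0.2 * x for six stars.
x = np.linspace(-1.0, 1.0, 6)
design_matrix = np.column_stack([np.ones_like(x), x])
flux = 1.0 + 0.2 * x
ivar = np.ones_like(x)

guesses = initial_theta_guesses(design_matrix, flux, ivar)
sources = [source for _, source in guesses]
```

With noiseless data the least-squares guess recovers the coefficients `[1.0, 0.2]`, and the fiducial guess is always appended last.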
