astra.tasks.io.sdss5

Module Contents

Classes

SDSS5DataModelTask A task to represent a SDSS-V data product.
ApVisitFile A task to represent a SDSS-V ApVisit data product.
ApStarFile A task to represent a SDSS-V ApStar data product.
SpecFile A task to represent a (co-added) SDSS-V BHM Spec data product.
class astra.tasks.io.sdss5.SDSS5DataModelTask(*args, **kwargs)

A task to represent a SDSS-V data product.

release
public
use_remote
remote_access_method
mirror
verbose
tree
local_path

The local path of the file.

remote_path

The remote path of the file. Useful for debugging path problems.

This is relatively expensive to return, so don’t use this to download sources. Instead use one instance of sdss_access.HttpAccess to get the remote paths of many sources.

astra_version_major
astra_version_minor
astra_version_micro
astra_version_dev
strict_output_checking
is_batch_mode

A boolean property indicating whether the task is in batch mode or not.

output_base_dir

Base directory for storing task outputs.

_event_callbacks
priority = 0
disabled = False
resources
worker_timeout
max_batch_size
batchable

True if this instance can be run as part of a batch. By default, True if it has any batched parameters

retry_count

Override this positive integer to have different retry_count at task level Check scheduler-config

disable_hard_timeout

Override this positive integer to have different disable_hard_timeout at task level. Check scheduler-config

disable_window

Override this positive integer to have different disable_window at task level. Check scheduler-config

disable_window_seconds
owner_email

Override this to send out additional error emails to task owner, in addition to the one defined in the global configuration. This should return a string or a list of strings. e.g. ‘test@exmaple.com’ or [‘test1@example.com’, ‘test2@example.com’]

use_cmdline_section

Property used by core config such as --workers etc. These will be exposed without the class as prefix.

accepts_messages

For configuring which scheduler messages can be received. When falsy, this tasks does not accept any message. When True, all messages are accepted.

task_module

Returns what Python module to import to get access to this class.

_visible_in_registry = True
__not_user_specified = __not_user_specified
_namespace_at_class_time
task_namespace

This value can be overriden to set the namespace that will be used. (See Task.namespaces_famlies_and_ids) If it’s not specified and you try to read this value anyway, it will return garbage. Please use get_task_namespace() to read the namespace.

Note that setting this value with @property will not work, because this is a class level value.

task_family

DEPRECATED since after 2.4.0. See get_task_family() instead. Hopefully there will be less meta magic in Luigi.

Convenience method since a property on the metaclass isn’t directly accessible through the class instances.

param_args
classmethod get_local_path(cls, release, public=True, mirror=False, verbose=True, **kwargs)
output(self)

The outputs of this task.

get_remote_http(self)

Download the remote file using HTTP.

get_remote_rsync(self)

Download the remote file using rsync.

get_remote(self)

Download the remote file.

_warn_on_wrong_param_types(self, strict=False)
__repr__(self)

Build a task representation like MyTask(hash: param1=1.5, param2='5')

get_common_param_kwargs(self, klass, include_significant=True)
get_common_param_names(self, klass, include_significant=True)
get_hashed_params(self, only_significant=True, only_public=False)
to_str_params(self, only_significant=True, only_public=False)

Convert all parameters to a str->str hash.

classmethod from_str_params(cls, params_str)

Creates an instance from a str->str hash. :param params_str: dict of param name -> value as string.

get_batch_task_kwds(self, include_non_batch_keywords=True)
get_batch_tasks(self)

A generator that yields task(s) that are to be run. Works in single or batch mode.

get_batch_size(self)

Get the number of batched tasks.

get_input(self, key)

Return a single input from the task, assuming the inputs are a dictionary. This can be performed by using task.input()[key], but when there are many inputs (e.g., in batch mode), this can be unnecessarily slow.

Parameters:key – The key of the requirements dictionary to return.
requires(self)

The requirements of this task.

query_state(self, full_output=False)

Query the database for this task and return the SQLAlchemy ORM Query.

Parameters:full_output – [optional] Optionally return a three-length tuple containing the ORM query, database model, and keywords to filter by.
get_or_create_state(self, defaults=None)

Get (or create) an entry in the database for this task.

Note that this will only create an entry for the task, and not for the parameters of the task. This is useful when creating many task entries, with the intent you will create the parameter entries later, and you want to minimise overhead. If you want to create an entry for this task and the parameters, use create_state().

This function returns a two-length tuple containing the SQLAlchemy instance, and a boolean flag indicating whether the entry was created (True) or just retrieved (False).

Parameters:defaults – [optional] A dictionary of default key, value pairs to provide if the entry needs to be created in the database.
create_state(self)

Create an entry in the database for this task, and its parameters.

delete_state(self, cascade=False)

Delete this task entry in the database.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
update_state(self, state, cascade=False)

Update the task entry in the database with the given state dictionary.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
trigger_event_start(self)

Trigger an event signalling that the task has started.

trigger_event_succeeded(self)

Trigger an event signalling that the task has succeeded.

trigger_event_failed(self)

Trigger an event signalling that the task has failed.

trigger_event_processing_time(self, duration, cascade=False)

Trigger the event that signals the processing time of the event.

Parameters:
  • duration – The time taken for this event.
  • cascade – [optional] Also trigger the task succeeded event (default: False).
_owner_list(self)

Turns the owner_email property into a list. This should not be overridden.

classmethod event_handler(cls, event)

Decorator for adding event handlers.

trigger_event(self, event, *args, **kwargs)

Trigger that calls all of the specified events associated with this class.

classmethod get_task_namespace(cls)

The task family for the given class.

Note: You normally don’t want to override this.

classmethod get_task_family(cls)

The task family for the given class.

If task_namespace is not set, then it’s simply the name of the class. Otherwise, <task_namespace>. is prefixed to the class name.

Note: You normally don’t want to override this.

classmethod get_params(cls)

Returns all of the Parameters for this Task.

classmethod batch_param_names(cls)
classmethod get_param_names(cls, include_significant=False)
classmethod get_param_values(cls, params, args, kwargs)

Get the values of the parameters from the args and kwargs.

Parameters:
  • params – list of (param_name, Parameter).
  • args – positional arguments
  • kwargs – keyword arguments.
Returns:

list of (name, value) tuples, one for each parameter.

initialized(self)

Returns True if the Task is initialized and False otherwise.

_get_param_visibilities(self)
clone(self, cls=None, **kwargs)

Creates a new instance from an existing instance where some of the args have changed.

There’s at least two scenarios where this is useful (see test/clone_test.py):

  • remove a lot of boiler plate when you have recursive dependencies and lots of args
  • there’s task inheritance and some logic is on the base class
Parameters:
  • cls
  • kwargs
Returns:

__hash__(self)

Return hash(self).

__eq__(self, other)

Return self==value.

complete(self)

If the task has any outputs, return True if all outputs exist. Otherwise, return False.

However, you may freely override this method with custom logic.

classmethod bulk_complete(cls, parameter_tuples)

Returns those of parameter_tuples for which this Task is complete.

Override (with an efficient implementation) for efficient scheduling with range tools. Keep the logic consistent with that of complete().

_requires(self)

Override in “template” tasks which themselves are supposed to be subclassed and thus have their requires() overridden (name preserved to provide consistent end-user experience), yet need to introduce (non-input) dependencies.

Must return an iterable which among others contains the _requires() of the superclass.

process_resources(self)

Override in “template” tasks which provide common resource functionality but allow subclasses to specify additional resources while preserving the name for consistent end-user experience.

input(self)

Returns the outputs of the Tasks returned by requires()

See Task.input

Returns:a list of Target objects which are specified as outputs of all required Tasks.
deps(self)

Internal method used by the scheduler.

Returns the flattened list of requires.

run(self)

The task run method, to be overridden in a subclass.

See Task.run

on_failure(self, exception)

Override for custom error handling.

This method gets called if an exception is raised in run(). The returned value of this method is json encoded and sent to the scheduler as the expl argument. Its string representation will be used as the body of the error email sent out if any.

Default behavior is to return a string representation of the stack trace.

on_success(self)

Override for doing custom completion handling for a larger class of tasks

This method gets called when run() completes without raising any exceptions.

The returned value is json encoded and sent to the scheduler as the expl argument.

Default behavior is to send an None value

no_unpicklable_properties(self)

Remove unpicklable properties before dump task and resume them after.

This method could be called in subtask’s dump method, to ensure unpicklable properties won’t break dump.

This method is a context-manager which can be called as below:

class astra.tasks.io.sdss5.ApVisitFile(*args, **kwargs)

A task to represent a SDSS-V ApVisit data product.

Parameters:
  • fiber – The fiber number that the object was observed with.
  • plate – The plate identifier.
  • telescope – The name of the telescope used to observe the object (e.g., apo25m).
  • field – The field the object was observed in.
  • mjd – The Modified Julian Date of the observation.
  • apred – The data reduction version number (e.g., daily).
  • release – The name of the SDSS data release (e.g., sdss5).
sdss_data_model_name = apVisit
fiber
plate
field
mjd
apred
telescope
release
public
use_remote
remote_access_method
mirror
verbose
tree
local_path

The local path of the file.

remote_path

The remote path of the file. Useful for debugging path problems.

This is relatively expensive to return, so don’t use this to download sources. Instead use one instance of sdss_access.HttpAccess to get the remote paths of many sources.

astra_version_major
astra_version_minor
astra_version_micro
astra_version_dev
strict_output_checking
is_batch_mode

A boolean property indicating whether the task is in batch mode or not.

output_base_dir

Base directory for storing task outputs.

_event_callbacks
priority = 0
disabled = False
resources
worker_timeout
max_batch_size
batchable

True if this instance can be run as part of a batch. By default, True if it has any batched parameters

retry_count

Override this positive integer to have different retry_count at task level Check scheduler-config

disable_hard_timeout

Override this positive integer to have different disable_hard_timeout at task level. Check scheduler-config

disable_window

Override this positive integer to have different disable_window at task level. Check scheduler-config

disable_window_seconds
owner_email

Override this to send out additional error emails to task owner, in addition to the one defined in the global configuration. This should return a string or a list of strings. e.g. ‘test@exmaple.com’ or [‘test1@example.com’, ‘test2@example.com’]

use_cmdline_section

Property used by core config such as --workers etc. These will be exposed without the class as prefix.

accepts_messages

For configuring which scheduler messages can be received. When falsy, this tasks does not accept any message. When True, all messages are accepted.

task_module

Returns what Python module to import to get access to this class.

_visible_in_registry = True
__not_user_specified = __not_user_specified
_namespace_at_class_time
task_namespace

This value can be overriden to set the namespace that will be used. (See Task.namespaces_famlies_and_ids) If it’s not specified and you try to read this value anyway, it will return garbage. Please use get_task_namespace() to read the namespace.

Note that setting this value with @property will not work, because this is a class level value.

task_family

DEPRECATED since after 2.4.0. See get_task_family() instead. Hopefully there will be less meta magic in Luigi.

Convenience method since a property on the metaclass isn’t directly accessible through the class instances.

param_args
classmethod get_local_path(cls, release, public=True, mirror=False, verbose=True, **kwargs)
output(self)

The outputs of this task.

get_remote_http(self)

Download the remote file using HTTP.

get_remote_rsync(self)

Download the remote file using rsync.

get_remote(self)

Download the remote file.

_warn_on_wrong_param_types(self, strict=False)
__repr__(self)

Build a task representation like MyTask(hash: param1=1.5, param2='5')

get_common_param_kwargs(self, klass, include_significant=True)
get_common_param_names(self, klass, include_significant=True)
get_hashed_params(self, only_significant=True, only_public=False)
to_str_params(self, only_significant=True, only_public=False)

Convert all parameters to a str->str hash.

classmethod from_str_params(cls, params_str)

Creates an instance from a str->str hash. :param params_str: dict of param name -> value as string.

get_batch_task_kwds(self, include_non_batch_keywords=True)
get_batch_tasks(self)

A generator that yields task(s) that are to be run. Works in single or batch mode.

get_batch_size(self)

Get the number of batched tasks.

get_input(self, key)

Return a single input from the task, assuming the inputs are a dictionary. This can be performed by using task.input()[key], but when there are many inputs (e.g., in batch mode), this can be unnecessarily slow.

Parameters:key – The key of the requirements dictionary to return.
requires(self)

The requirements of this task.

query_state(self, full_output=False)

Query the database for this task and return the SQLAlchemy ORM Query.

Parameters:full_output – [optional] Optionally return a three-length tuple containing the ORM query, database model, and keywords to filter by.
get_or_create_state(self, defaults=None)

Get (or create) an entry in the database for this task.

Note that this will only create an entry for the task, and not for the parameters of the task. This is useful when creating many task entries, with the intent you will create the parameter entries later, and you want to minimise overhead. If you want to create an entry for this task and the parameters, use create_state().

This function returns a two-length tuple containing the SQLAlchemy instance, and a boolean flag indicating whether the entry was created (True) or just retrieved (False).

Parameters:defaults – [optional] A dictionary of default key, value pairs to provide if the entry needs to be created in the database.
create_state(self)

Create an entry in the database for this task, and its parameters.

delete_state(self, cascade=False)

Delete this task entry in the database.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
update_state(self, state, cascade=False)

Update the task entry in the database with the given state dictionary.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
trigger_event_start(self)

Trigger an event signalling that the task has started.

trigger_event_succeeded(self)

Trigger an event signalling that the task has succeeded.

trigger_event_failed(self)

Trigger an event signalling that the task has failed.

trigger_event_processing_time(self, duration, cascade=False)

Trigger the event that signals the processing time of the event.

Parameters:
  • duration – The time taken for this event.
  • cascade – [optional] Also trigger the task succeeded event (default: False).
_owner_list(self)

Turns the owner_email property into a list. This should not be overridden.

classmethod event_handler(cls, event)

Decorator for adding event handlers.

trigger_event(self, event, *args, **kwargs)

Trigger that calls all of the specified events associated with this class.

classmethod get_task_namespace(cls)

The task family for the given class.

Note: You normally don’t want to override this.

classmethod get_task_family(cls)

The task family for the given class.

If task_namespace is not set, then it’s simply the name of the class. Otherwise, <task_namespace>. is prefixed to the class name.

Note: You normally don’t want to override this.

classmethod get_params(cls)

Returns all of the Parameters for this Task.

classmethod batch_param_names(cls)
classmethod get_param_names(cls, include_significant=False)
classmethod get_param_values(cls, params, args, kwargs)

Get the values of the parameters from the args and kwargs.

Parameters:
  • params – list of (param_name, Parameter).
  • args – positional arguments
  • kwargs – keyword arguments.
Returns:

list of (name, value) tuples, one for each parameter.

initialized(self)

Returns True if the Task is initialized and False otherwise.

_get_param_visibilities(self)
clone(self, cls=None, **kwargs)

Creates a new instance from an existing instance where some of the args have changed.

There’s at least two scenarios where this is useful (see test/clone_test.py):

  • remove a lot of boiler plate when you have recursive dependencies and lots of args
  • there’s task inheritance and some logic is on the base class
Parameters:
  • cls
  • kwargs
Returns:

__hash__(self)

Return hash(self).

__eq__(self, other)

Return self==value.

complete(self)

If the task has any outputs, return True if all outputs exist. Otherwise, return False.

However, you may freely override this method with custom logic.

classmethod bulk_complete(cls, parameter_tuples)

Returns those of parameter_tuples for which this Task is complete.

Override (with an efficient implementation) for efficient scheduling with range tools. Keep the logic consistent with that of complete().

_requires(self)

Override in “template” tasks which themselves are supposed to be subclassed and thus have their requires() overridden (name preserved to provide consistent end-user experience), yet need to introduce (non-input) dependencies.

Must return an iterable which among others contains the _requires() of the superclass.

process_resources(self)

Override in “template” tasks which provide common resource functionality but allow subclasses to specify additional resources while preserving the name for consistent end-user experience.

input(self)

Returns the outputs of the Tasks returned by requires()

See Task.input

Returns:a list of Target objects which are specified as outputs of all required Tasks.
deps(self)

Internal method used by the scheduler.

Returns the flattened list of requires.

run(self)

The task run method, to be overridden in a subclass.

See Task.run

on_failure(self, exception)

Override for custom error handling.

This method gets called if an exception is raised in run(). The returned value of this method is json encoded and sent to the scheduler as the expl argument. Its string representation will be used as the body of the error email sent out if any.

Default behavior is to return a string representation of the stack trace.

on_success(self)

Override for doing custom completion handling for a larger class of tasks

This method gets called when run() completes without raising any exceptions.

The returned value is json encoded and sent to the scheduler as the expl argument.

Default behavior is to send an None value

no_unpicklable_properties(self)

Remove unpicklable properties before dump task and resume them after.

This method could be called in subtask’s dump method, to ensure unpicklable properties won’t break dump.

This method is a context-manager which can be called as below:

class astra.tasks.io.sdss5.ApStarFile(*args, **kwargs)

A task to represent a SDSS-V ApStar data product.

Parameters:
  • obj – The name of the object.
  • healpix – The healpix identifier based on the object location.
  • telescope – The name of the telescope used to observe the object (e.g., apo25m).
  • apstar – A string indicating the kind of object (usually ‘star’).
  • apred – The data reduction version number (e.g., daily).
  • release – The name of the SDSS data release (e.g., sdss5).
sdss_data_model_name = apStar
obj
healpix
apstar
apred
telescope
release
public
use_remote
remote_access_method
mirror
verbose
tree
local_path

The local path of the file.

remote_path

The remote path of the file. Useful for debugging path problems.

This is relatively expensive to return, so don’t use this to download sources. Instead use one instance of sdss_access.HttpAccess to get the remote paths of many sources.

astra_version_major
astra_version_minor
astra_version_micro
astra_version_dev
strict_output_checking
is_batch_mode

A boolean property indicating whether the task is in batch mode or not.

output_base_dir

Base directory for storing task outputs.

_event_callbacks
priority = 0
disabled = False
resources
worker_timeout
max_batch_size
batchable

True if this instance can be run as part of a batch. By default, True if it has any batched parameters

retry_count

Override this positive integer to have different retry_count at task level Check scheduler-config

disable_hard_timeout

Override this positive integer to have different disable_hard_timeout at task level. Check scheduler-config

disable_window

Override this positive integer to have different disable_window at task level. Check scheduler-config

disable_window_seconds
owner_email

Override this to send out additional error emails to task owner, in addition to the one defined in the global configuration. This should return a string or a list of strings. e.g. ‘test@exmaple.com’ or [‘test1@example.com’, ‘test2@example.com’]

use_cmdline_section

Property used by core config such as --workers etc. These will be exposed without the class as prefix.

accepts_messages

For configuring which scheduler messages can be received. When falsy, this tasks does not accept any message. When True, all messages are accepted.

task_module

Returns what Python module to import to get access to this class.

_visible_in_registry = True
__not_user_specified = __not_user_specified
_namespace_at_class_time
task_namespace

This value can be overriden to set the namespace that will be used. (See Task.namespaces_famlies_and_ids) If it’s not specified and you try to read this value anyway, it will return garbage. Please use get_task_namespace() to read the namespace.

Note that setting this value with @property will not work, because this is a class level value.

task_family

DEPRECATED since after 2.4.0. See get_task_family() instead. Hopefully there will be less meta magic in Luigi.

Convenience method since a property on the metaclass isn’t directly accessible through the class instances.

param_args
get_or_create_data_model_relationships(self)

Return the keywords that reference the input data model for this task.

writer(self, spectrum, path, **kwargs)
classmethod get_local_path(cls, release, public=True, mirror=False, verbose=True, **kwargs)
output(self)

The outputs of this task.

get_remote_http(self)

Download the remote file using HTTP.

get_remote_rsync(self)

Download the remote file using rsync.

get_remote(self)

Download the remote file.

_warn_on_wrong_param_types(self, strict=False)
__repr__(self)

Build a task representation like MyTask(hash: param1=1.5, param2='5')

get_common_param_kwargs(self, klass, include_significant=True)
get_common_param_names(self, klass, include_significant=True)
get_hashed_params(self, only_significant=True, only_public=False)
to_str_params(self, only_significant=True, only_public=False)

Convert all parameters to a str->str hash.

classmethod from_str_params(cls, params_str)

Creates an instance from a str->str hash. :param params_str: dict of param name -> value as string.

get_batch_task_kwds(self, include_non_batch_keywords=True)
get_batch_tasks(self)

A generator that yields task(s) that are to be run. Works in single or batch mode.

get_batch_size(self)

Get the number of batched tasks.

get_input(self, key)

Return a single input from the task, assuming the inputs are a dictionary. This can be performed by using task.input()[key], but when there are many inputs (e.g., in batch mode), this can be unnecessarily slow.

Parameters:key – The key of the requirements dictionary to return.
requires(self)

The requirements of this task.

query_state(self, full_output=False)

Query the database for this task and return the SQLAlchemy ORM Query.

Parameters:full_output – [optional] Optionally return a three-length tuple containing the ORM query, database model, and keywords to filter by.
get_or_create_state(self, defaults=None)

Get (or create) an entry in the database for this task.

Note that this will only create an entry for the task, and not for the parameters of the task. This is useful when creating many task entries, with the intent you will create the parameter entries later, and you want to minimise overhead. If you want to create an entry for this task and the parameters, use create_state().

This function returns a two-length tuple containing the SQLAlchemy instance, and a boolean flag indicating whether the entry was created (True) or just retrieved (False).

Parameters:defaults – [optional] A dictionary of default key, value pairs to provide if the entry needs to be created in the database.
create_state(self)

Create an entry in the database for this task, and its parameters.

delete_state(self, cascade=False)

Delete this task entry in the database.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
update_state(self, state, cascade=False)

Update the task entry in the database with the given state dictionary.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
trigger_event_start(self)

Trigger an event signalling that the task has started.

trigger_event_succeeded(self)

Trigger an event signalling that the task has succeeded.

trigger_event_failed(self)

Trigger an event signalling that the task has failed.

trigger_event_processing_time(self, duration, cascade=False)

Trigger the event that signals the processing time of the event.

Parameters:
  • duration – The time taken for this event.
  • cascade – [optional] Also trigger the task succeeded event (default: False).
_owner_list(self)

Turns the owner_email property into a list. This should not be overridden.

classmethod event_handler(cls, event)

Decorator for adding event handlers.

trigger_event(self, event, *args, **kwargs)

Trigger that calls all of the specified events associated with this class.

classmethod get_task_namespace(cls)

The task family for the given class.

Note: You normally don’t want to override this.

classmethod get_task_family(cls)

The task family for the given class.

If task_namespace is not set, then it’s simply the name of the class. Otherwise, <task_namespace>. is prefixed to the class name.

Note: You normally don’t want to override this.

classmethod get_params(cls)

Returns all of the Parameters for this Task.

classmethod batch_param_names(cls)
classmethod get_param_names(cls, include_significant=False)
classmethod get_param_values(cls, params, args, kwargs)

Get the values of the parameters from the args and kwargs.

Parameters:
  • params – list of (param_name, Parameter).
  • args – positional arguments
  • kwargs – keyword arguments.
Returns:

list of (name, value) tuples, one for each parameter.

initialized(self)

Returns True if the Task is initialized and False otherwise.

_get_param_visibilities(self)
clone(self, cls=None, **kwargs)

Creates a new instance from an existing instance where some of the args have changed.

There’s at least two scenarios where this is useful (see test/clone_test.py):

  • remove a lot of boiler plate when you have recursive dependencies and lots of args
  • there’s task inheritance and some logic is on the base class
Parameters:
  • cls
  • kwargs
Returns:

__hash__(self)

Return hash(self).

__eq__(self, other)

Return self==value.

complete(self)

If the task has any outputs, return True if all outputs exist. Otherwise, return False.

However, you may freely override this method with custom logic.

classmethod bulk_complete(cls, parameter_tuples)

Returns those of parameter_tuples for which this Task is complete.

Override (with an efficient implementation) for efficient scheduling with range tools. Keep the logic consistent with that of complete().

_requires(self)

Override in “template” tasks which themselves are supposed to be subclassed and thus have their requires() overridden (name preserved to provide consistent end-user experience), yet need to introduce (non-input) dependencies.

Must return an iterable which among others contains the _requires() of the superclass.

process_resources(self)

Override in “template” tasks which provide common resource functionality but allow subclasses to specify additional resources while preserving the name for consistent end-user experience.

input(self)

Returns the outputs of the Tasks returned by requires()

See Task.input

Returns:a list of Target objects which are specified as outputs of all required Tasks.
deps(self)

Internal method used by the scheduler.

Returns the flattened list of requires.

run(self)

The task run method, to be overridden in a subclass.

See Task.run

on_failure(self, exception)

Override for custom error handling.

This method gets called if an exception is raised in run(). The returned value of this method is json encoded and sent to the scheduler as the expl argument. Its string representation will be used as the body of the error email sent out if any.

Default behavior is to return a string representation of the stack trace.

on_success(self)

Override for doing custom completion handling for a larger class of tasks

This method gets called when run() completes without raising any exceptions.

The returned value is json encoded and sent to the scheduler as the expl argument.

Default behavior is to send an None value

no_unpicklable_properties(self)

Remove unpicklable properties before dump task and resume them after.

This method could be called in subtask’s dump method, to ensure unpicklable properties won’t break dump.

This method is a context-manager which can be called as below:

class astra.tasks.io.sdss5.SpecFile(*args, **kwargs)

A task to represent a (co-added) SDSS-V BHM Spec data product.

Parameters:
  • catalogid – The catalog identifier of the object.
  • plate – The identifier of the plate used for observations.
  • run2d – The version of the data reduction pipeline used.
  • release – The name of the SDSS data release (e.g., sdss5).
sdss_data_model_name = spec
catalogid
mjd
plate
run2d
local_path

A monkey-patch while upstream people figure out their shit.

release
public
use_remote
remote_access_method
mirror
verbose
tree
remote_path

The remote path of the file. Useful for debugging path problems.

This is relatively expensive to return, so don’t use this to download sources. Instead use one instance of sdss_access.HttpAccess to get the remote paths of many sources.

astra_version_major
astra_version_minor
astra_version_micro
astra_version_dev
strict_output_checking
is_batch_mode

A boolean property indicating whether the task is in batch mode or not.

output_base_dir

Base directory for storing task outputs.

_event_callbacks
priority = 0
disabled = False
resources
worker_timeout
max_batch_size
batchable

True if this instance can be run as part of a batch. By default, True if it has any batched parameters

retry_count

Override this positive integer to have different retry_count at task level Check scheduler-config

disable_hard_timeout

Override this positive integer to have different disable_hard_timeout at task level. Check scheduler-config

disable_window

Override this positive integer to have different disable_window at task level. Check scheduler-config

disable_window_seconds
owner_email

Override this to send out additional error emails to task owner, in addition to the one defined in the global configuration. This should return a string or a list of strings. e.g. ‘test@exmaple.com’ or [‘test1@example.com’, ‘test2@example.com’]

use_cmdline_section

Property used by core config such as --workers etc. These will be exposed without the class as prefix.

accepts_messages

For configuring which scheduler messages can be received. When falsy, this tasks does not accept any message. When True, all messages are accepted.

task_module

Returns what Python module to import to get access to this class.

_visible_in_registry = True
__not_user_specified = __not_user_specified
_namespace_at_class_time
task_namespace

This value can be overriden to set the namespace that will be used. (See Task.namespaces_famlies_and_ids) If it’s not specified and you try to read this value anyway, it will return garbage. Please use get_task_namespace() to read the namespace.

Note that setting this value with @property will not work, because this is a class level value.

task_family

DEPRECATED since after 2.4.0. See get_task_family() instead. Hopefully there will be less meta magic in Luigi.

Convenience method since a property on the metaclass isn’t directly accessible through the class instances.

param_args
classmethod get_local_path(cls, release, public=True, mirror=False, verbose=True, **kwargs)
output(self)

The outputs of this task.

get_remote_http(self)

Download the remote file using HTTP.

get_remote_rsync(self)

Download the remote file using rsync.

get_remote(self)

Download the remote file.

_warn_on_wrong_param_types(self, strict=False)
__repr__(self)

Build a task representation like MyTask(hash: param1=1.5, param2='5')

get_common_param_kwargs(self, klass, include_significant=True)
get_common_param_names(self, klass, include_significant=True)
get_hashed_params(self, only_significant=True, only_public=False)
to_str_params(self, only_significant=True, only_public=False)

Convert all parameters to a str->str hash.

classmethod from_str_params(cls, params_str)

Creates an instance from a str->str hash. :param params_str: dict of param name -> value as string.

get_batch_task_kwds(self, include_non_batch_keywords=True)
get_batch_tasks(self)

A generator that yields task(s) that are to be run. Works in single or batch mode.

get_batch_size(self)

Get the number of batched tasks.

get_input(self, key)

Return a single input from the task, assuming the inputs are a dictionary. This can be performed by using task.input()[key], but when there are many inputs (e.g., in batch mode), this can be unnecessarily slow.

Parameters:key – The key of the requirements dictionary to return.
requires(self)

The requirements of this task.

query_state(self, full_output=False)

Query the database for this task and return the SQLAlchemy ORM Query.

Parameters:full_output – [optional] Optionally return a three-length tuple containing the ORM query, database model, and keywords to filter by.
get_or_create_state(self, defaults=None)

Get (or create) an entry in the database for this task.

Note that this will only create an entry for the task, and not for the parameters of the task. This is useful when creating many task entries, with the intent you will create the parameter entries later, and you want to minimise overhead. If you want to create an entry for this task and the parameters, use create_state().

This function returns a two-length tuple containing the SQLAlchemy instance, and a boolean flag indicating whether the entry was created (True) or just retrieved (False).

Parameters:defaults – [optional] A dictionary of default key, value pairs to provide if the entry needs to be created in the database.
create_state(self)

Create an entry in the database for this task, and its parameters.

delete_state(self, cascade=False)

Delete this task entry in the database.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
update_state(self, state, cascade=False)

Update the task entry in the database with the given state dictionary.

Parameters:cascade – [optional] Cascade this to any tasks in this batch.
trigger_event_start(self)

Trigger an event signalling that the task has started.

trigger_event_succeeded(self)

Trigger an event signalling that the task has succeeded.

trigger_event_failed(self)

Trigger an event signalling that the task has failed.

trigger_event_processing_time(self, duration, cascade=False)

Trigger the event that signals the processing time of the event.

Parameters:
  • duration – The time taken for this event.
  • cascade – [optional] Also trigger the task succeeded event (default: False).
_owner_list(self)

Turns the owner_email property into a list. This should not be overridden.

classmethod event_handler(cls, event)

Decorator for adding event handlers.

trigger_event(self, event, *args, **kwargs)

Trigger that calls all of the specified events associated with this class.

classmethod get_task_namespace(cls)

The task family for the given class.

Note: You normally don’t want to override this.

classmethod get_task_family(cls)

The task family for the given class.

If task_namespace is not set, then it’s simply the name of the class. Otherwise, <task_namespace>. is prefixed to the class name.

Note: You normally don’t want to override this.

classmethod get_params(cls)

Returns all of the Parameters for this Task.

classmethod batch_param_names(cls)
classmethod get_param_names(cls, include_significant=False)
classmethod get_param_values(cls, params, args, kwargs)

Get the values of the parameters from the args and kwargs.

Parameters:
  • params – list of (param_name, Parameter).
  • args – positional arguments
  • kwargs – keyword arguments.
Returns:

list of (name, value) tuples, one for each parameter.

initialized(self)

Returns True if the Task is initialized and False otherwise.

_get_param_visibilities(self)
clone(self, cls=None, **kwargs)

Creates a new instance from an existing instance where some of the args have changed.

There’s at least two scenarios where this is useful (see test/clone_test.py):

  • remove a lot of boiler plate when you have recursive dependencies and lots of args
  • there’s task inheritance and some logic is on the base class
Parameters:
  • cls
  • kwargs
Returns:

__hash__(self)

Return hash(self).

__eq__(self, other)

Return self==value.

complete(self)

If the task has any outputs, return True if all outputs exist. Otherwise, return False.

However, you may freely override this method with custom logic.

classmethod bulk_complete(cls, parameter_tuples)

Returns those of parameter_tuples for which this Task is complete.

Override (with an efficient implementation) for efficient scheduling with range tools. Keep the logic consistent with that of complete().

_requires(self)

Override in “template” tasks which themselves are supposed to be subclassed and thus have their requires() overridden (name preserved to provide consistent end-user experience), yet need to introduce (non-input) dependencies.

Must return an iterable which among others contains the _requires() of the superclass.

process_resources(self)

Override in “template” tasks which provide common resource functionality but allow subclasses to specify additional resources while preserving the name for consistent end-user experience.

input(self)

Returns the outputs of the Tasks returned by requires()

See Task.input

Returns:a list of Target objects which are specified as outputs of all required Tasks.
deps(self)

Internal method used by the scheduler.

Returns the flattened list of requires.

run(self)

The task run method, to be overridden in a subclass.

See Task.run

on_failure(self, exception)

Override for custom error handling.

This method gets called if an exception is raised in run(). The returned value of this method is json encoded and sent to the scheduler as the expl argument. Its string representation will be used as the body of the error email sent out if any.

Default behavior is to return a string representation of the stack trace.

on_success(self)

Override for doing custom completion handling for a larger class of tasks

This method gets called when run() completes without raising any exceptions.

The returned value is json encoded and sent to the scheduler as the expl argument.

Default behavior is to send an None value

no_unpicklable_properties(self)

Remove unpicklable properties before dump task and resume them after.

This method could be called in subtask’s dump method, to ensure unpicklable properties won’t break dump.

This method is a context-manager which can be called as below: