# AutodiffComposition¶

## Overview¶

Warning

As of PsyNeuLink 0.7.5, the API for using AutodiffCompositions has changed slightly; please see the PsyNeuLink documentation for more details.

AutodiffComposition is a subclass of Composition used to train feedforward neural network models through integration with PyTorch, a popular machine learning library. Training via PyTorch executes considerably faster than the standard implementation of learning in a Composition, which uses the Composition's native learning methods. An AutodiffComposition is configured and run similarly to a standard Composition, with some exceptions that are described below.

## Creating an AutodiffComposition¶

An AutodiffComposition can be created by calling its constructor, and then adding Components using the standard Composition methods for doing so. The constructor also includes a number of parameters that are specific to the AutodiffComposition; see the Class Reference below for a list of these parameters.

Warning

Mechanisms or Projections should not be added to or deleted from an AutodiffComposition after it has been run for the first time. Unlike an ordinary Composition, AutodiffComposition does not support this functionality.

Warning

When comparing models built in PyTorch to those using AutodiffComposition, the bias parameter of PyTorch modules should be set to False, as AutodiffComposition does not currently support trainable biases.
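For example, a PyTorch layer built to mirror a 3 → 2 MappingProjection in such a comparison would be created with its bias disabled (a minimal sketch; the layer shapes here are illustrative):

```python
import torch

# A linear layer matching a 3 -> 2 MappingProjection; bias=False ensures
# the PyTorch model, like AutodiffComposition, has no trainable biases
layer = torch.nn.Linear(in_features=3, out_features=2, bias=False)

print(layer.bias)          # None -- no bias parameter is created
print(layer.weight.shape)  # torch.Size([2, 3])
```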

## Execution¶

An AutodiffComposition’s run, execute, and learn methods are the same as for a Composition.

The following is an example showing how to create a simple AutodiffComposition, specify its inputs and targets, and run it with learning enabled and disabled.

>>> import numpy as np
>>> import psyneulink as pnl
>>> # Set up PsyNeuLink Components
>>> my_mech_1 = pnl.TransferMechanism(function=pnl.Linear, size=3)
>>> my_mech_2 = pnl.TransferMechanism(function=pnl.Linear, size=2)
>>> my_projection = pnl.MappingProjection(matrix=np.random.randn(3, 2),
...                                       sender=my_mech_1,
...                                       receiver=my_mech_2)
>>> # Create AutodiffComposition and add the Components to it
>>> my_autodiff = pnl.AutodiffComposition()
>>> my_autodiff.add_node(my_mech_1)
>>> my_autodiff.add_node(my_mech_2)
>>> my_autodiff.add_projection(sender=my_mech_1, projection=my_projection, receiver=my_mech_2)
>>> # Specify inputs and targets
>>> my_inputs = {my_mech_1: [[1, 2, 3]]}
>>> my_targets = {my_mech_2: [[4, 5]]}
>>> input_dict = {"inputs": my_inputs, "targets": my_targets, "epochs": 2}
>>> # Run Composition in learning mode
>>> my_autodiff.learn(inputs=input_dict)
>>> # Run Composition in test mode
>>> my_autodiff.run(inputs=input_dict['inputs'])


### Logging¶

Logging in AutodiffCompositions follows the same procedure as logging in a Composition. However, because an AutodiffComposition internally converts all of its Mechanisms to an equivalent PyTorch model, its inner Components are not actually executed. As a result, there is limited support for logging the parameters of Components inside an AutodiffComposition; currently, the only supported parameters are:

1. the matrix parameter of Projections
2. the value parameter of its inner components

### Nested Execution¶

Like any other Composition, an AutodiffComposition may be nested inside another.

The following shows how the AutodiffComposition created in the previous example can be nested and run inside another Composition:

>>> # Create outer composition and add the nested AutodiffComposition to it
>>> my_outer_composition = pnl.Composition()
>>> my_outer_composition.add_node(my_autodiff)
>>> # Specify dict containing inputs and targets for nested Composition
>>> training_input = {my_autodiff: input_dict}
>>> # Run in learning mode
>>> result1 = my_outer_composition.learn(inputs=training_input)


## Class Reference¶

class psyneulink.library.compositions.autodiffcomposition.AutodiffComposition(learning_rate=None, optimizer_type='sgd', weight_decay=0, loss_spec='mse', disable_learning=False, refresh_losses=False, disable_cuda=True, cuda_index=None, force_no_retain_graph=False, name='autodiff_composition')

Subclass of Composition that trains models using PyTorch. See Composition for additional arguments and attributes.

Parameters:

- learning_rate (float : default 0.001) – the learning rate, which is passed to the optimizer.
- optimizer_type (str : default 'sgd') – the kind of optimizer used in training. The current options are 'sgd' and 'adam'.
- weight_decay (float : default 0) – specifies the L2 penalty (which discourages large weights) used by the optimizer.
- loss_spec (str or PyTorch loss function : default 'mse') – specifies the loss function for training. The current string options are 'mse' (the default), 'crossentropy', 'l1', 'nll', 'poissonnll', and 'kldiv'. Any PyTorch loss function can be used, such as those listed at https://pytorch.org/docs/stable/nn.html#loss-functions.
- disable_learning (bool : default False) – specifies whether the AutodiffComposition should disable learning when run in learning mode.

Attributes:

- losses (list of floats) – tracks the average loss for each weight update (i.e., each minibatch).
- optimizer (PyTorch optimizer function) – the optimizer used for training; depends on the optimizer_type, learning_rate, and weight_decay arguments from initialization.
- loss (PyTorch loss function) – the loss function used for training; depends on the loss_spec argument from initialization.
_infer_input_nodes(nodes)

Maps inputs onto input mechanisms (as needed by learning)

Returns: A dict mapping input Mechanisms -> input values
_infer_output_nodes(nodes)

Maps targets onto target mechanisms (as needed by learning)

Returns: A dict mapping TargetMechanisms -> target values
_update_learning_parameters(context)

Updates parameters based on trials run since the last update.