AutodiffComposition¶
Overview¶
AutodiffComposition is a subclass of Composition for constructing and training feedforward neural networks, using either direct compilation (to LLVM) or automatic conversion to PyTorch, both of which considerably accelerate training (by as much as three orders of magnitude) compared to the standard implementation of learning in a Composition. Although an AutodiffComposition is constructed and executed in much the same way as a standard Composition, it is largely restricted to feedforward neural networks using supervised learning, and in particular the backpropagation learning algorithm, although it can be used for some forms of unsupervised learning that are supported in PyTorch (e.g., self-organizing maps).
Creating an AutodiffComposition¶
An AutodiffComposition can be created by calling its constructor, and then adding Components using the standard Composition methods for doing so (e.g., add_node, add_projection, add_linear_processing_pathway, etc.). The constructor also includes a number of parameters that are specific to the AutodiffComposition (see Class Reference for a list of these parameters, and examples below). While an AutodiffComposition can generally be created using the same methods as a standard Composition, there are a few restrictions that apply to its construction, summarized below.
Only one OutputPort per Node¶
The Nodes of an AutodiffComposition can currently have only one OutputPort, though that OutputPort can have more than one efferent MappingProjection. Nodes can also have more than one InputPort, each of which can receive more than one path_afferent Projection.
No Modulatory Components¶
All of the Components in an AutodiffComposition must be able to be subjected to learning, which means that no ModulatoryMechanisms can be included in an AutodiffComposition. Specifically, this precludes any learning components, ControlMechanisms, or a controller.
Learning Components. An AutodiffComposition cannot include any learning components themselves (i.e., LearningMechanisms, LearningSignals, or LearningProjections, nor the ComparatorMechanism or ObjectiveMechanism used to compute the loss for learning). These are constructed automatically when learning is executed in Python mode or LLVM mode, and PyTorch-compatible Components are constructed when it is executed in PyTorch mode.
Control Components. An AutodiffComposition also cannot include any ControlMechanisms or a controller. However, it can include Mechanisms that are subject to modulatory control (see Figure, and modulation) by ControlMechanisms outside the Composition, including the controller of a Composition within which the AutodiffComposition is nested. That is, an AutodiffComposition can be nested in a Composition that has such other Components (see Nested Execution and Modulation below).
No Bias Parameters¶
AutodiffComposition does not (currently) support the automatic construction of separate bias parameters. Thus, when constructing a model using an AutodiffComposition that corresponds to one in PyTorch, the bias parameter of PyTorch modules should be set to False.
Trainable biases can be specified explicitly in an AutodiffComposition by including a ProcessingMechanism that projects to the relevant Mechanism (i.e., implementing that layer of the network to receive the biases) using a MappingProjection with a matrix parameter that implements a diagonal matrix with values corresponding to the initial value of the biases, and setting the default_input Parameter of one of the ProcessingMechanism’s input_ports to DEFAULT_VARIABLE, and its default_variable equal to 1. ProcessingMechanisms configured in this way are assigned NodeRole BIAS, and the MappingProjection is subject to learning.
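The arithmetic behind this construction can be sketched in plain Python, without PsyNeuLink. Assuming the bias node's variable is a vector of ones, sending it through a diagonal matrix contributes exactly the diagonal entries to the receiving layer, which is what a bias term does. The helper names `diag` and `vec_mat` and all numeric values are hypothetical and chosen only for illustration; the sketch does not describe how PsyNeuLink implements the projection internally.

```python
def diag(values):
    """Build a square matrix with `values` on the diagonal and zeros elsewhere."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def vec_mat(vec, mat):
    """Multiply a row vector by a matrix of shape (len(vec), n_receivers)."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]


# Initial bias values for a 3-unit layer (illustrative numbers).
initial_biases = [0.5, -1.0, 2.0]
bias_matrix = diag(initial_biases)

# Assumption: the bias node always outputs a vector of ones.
bias_node_output = [1.0, 1.0, 1.0]
bias_contribution = vec_mat(bias_node_output, bias_matrix)

# Ordinary weighted input from a 2-unit sending layer to the same 3-unit layer.
weights = [[1.0, 0.0, 2.0],
           [0.0, 1.0, 1.0]]
layer_input = [3.0, 4.0]
weighted_input = vec_mat(layer_input, weights)

# The receiving layer sums both afferent projections.
net_input = [w + b for w, b in zip(weighted_input, bias_contribution)]
```

Because only the diagonal entries of the matrix ever multiply a nonzero input, training that matrix amounts to training one bias value per unit.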
Nesting¶
An AutodiffComposition can be nested inside another Composition for learning, and there can be any level of such nestings. However, all of the nested Compositions must be AutodiffCompositions. Furthermore, all nested Compositions use the learning_rate specified for the outermost Composition, whether this is specified in the call to its learn method, in its constructor, or by its default value (see learning_rate below for additional details).
Projections from Nodes in an immediately enclosing outer Composition to the input_CIM of a nested Composition, and from its output_CIM to Nodes in the outer Composition, are subject to learning; however, those within the nested Composition itself (i.e., from its input_CIM to its INPUT Nodes and from its OUTPUT Nodes to its output_CIM) are not subject to learning, as they serve simply as conduits of information between the outer Composition and the nested one.
Warning
Nested Compositions are supported for learning only in PyTorch mode, and will cause an error if the learn method of an AutodiffComposition is executed in Python mode or LLVM mode.
No Post-construction Modification¶
Mechanisms or Projections should not be added to or deleted from an AutodiffComposition after it has been executed. Unlike an ordinary Composition, AutodiffComposition does not support this functionality.
Execution¶
An AutodiffComposition’s run, execute, and learn methods are the same as for a Composition. However, the execution_mode argument of the learn method has different effects than for a standard Composition: it determines whether learning is executed using LLVM compilation or translation to PyTorch. These are each described in greater detail below, and summarized in this table, which provides a comparison of the different modes of execution for an AutodiffComposition and a standard Composition.
PyTorch mode¶
This is the default for an AutodiffComposition, but can be specified explicitly by setting execution_mode = ExecutionMode.PyTorch in the learn method (see example in Basics and Primer). In this mode, the AutodiffComposition is automatically translated to a PyTorch model for learning. This is comparable in speed to LLVM compilation, but provides greater flexibility, including the ability to include nested AutodiffCompositions in learning. Although it is best suited for use with supervised learning, it can also be used for some forms of unsupervised learning that are supported in PyTorch (e.g., self-organizing maps).
For an AutodiffComposition, a LearningScale of OPTIMIZATION_STEP corresponds to a single call to the forward() and backward() methods of the PyTorch model.
Note
While specifying ExecutionMode.PyTorch in the learn method of an AutodiffComposition causes it to use PyTorch for training, specifying this in the run method causes it to be executed using the Python interpreter (and not PyTorch); this is so that any modulation can take effect during execution (see Nested Execution and Modulation below), which is not supported by PyTorch.
Warning
Specifying ExecutionMode.LLVM or ExecutionMode.PyTorch in the learn() method of a standard Composition causes an error.
LLVM mode¶
This is specified by setting execution_mode = ExecutionMode.LLVMRun in the learn method of an AutodiffComposition. This provides the fastest performance, but is limited to supervised learning using the BackPropagation algorithm. This can be run using standard forms of loss, including mean squared error (MSE) and cross entropy, by specifying this in the loss_spec argument of the constructor (see AutodiffComposition for additional details, and Compilation Modes for more information about executing a Composition in compiled mode).
Note
Specifying ExecutionMode.LLVMRun in either the learn or run method of an AutodiffComposition causes it to (attempt to) use compiled execution in both cases; this is because LLVM compilation supports the use of modulation in PsyNeuLink models (in contrast to PyTorch mode; see the note above).
Python mode¶
An AutodiffComposition can also be run using the standard PsyNeuLink learning components. However, this cannot be used if the AutodiffComposition has any nested Compositions, irrespective of whether they are ordinary Compositions or AutodiffCompositions.
Nested Execution and Modulation¶
Like any other Composition, an AutodiffComposition may be nested inside another (see example below). During learning, none of the internal Components of the AutodiffComposition (e.g., intermediate layers of a neural network model) are accessible to the other Components of the outer Composition (e.g., as sources of information, or for modulation). However, when it is executed using its run method, the AutodiffComposition functions like any other, and all of its internal Components are accessible to other Components of the outer Composition. Thus, as long as access to its internal Components is not needed during learning, an AutodiffComposition can be trained, and then used to execute the trained Composition like any other.
Logging¶
Logging in AutodiffCompositions follows the same procedure as logging in a Composition. However, since an AutodiffComposition internally converts all of its Mechanisms either to LLVM or to an equivalent PyTorch model, its inner components are not actually executed. This means that there is limited support for logging parameters of components inside an AutodiffComposition; currently, the only supported parameters are:
Examples
The following is an example showing how to create a simple AutodiffComposition, specify its inputs and targets, and run it with learning enabled and disabled:
>>> import numpy as np
>>> import psyneulink as pnl
>>> # Set up PsyNeuLink Components
>>> my_mech_1 = pnl.TransferMechanism(function=pnl.Linear, input_shapes = 3)
>>> my_mech_2 = pnl.TransferMechanism(function=pnl.Linear, input_shapes = 2)
>>> my_projection = pnl.MappingProjection(matrix=np.random.randn(3,2),
... sender=my_mech_1,
... receiver=my_mech_2)
>>> # Create AutodiffComposition
>>> my_autodiff = pnl.AutodiffComposition()
>>> my_autodiff.add_node(my_mech_1)
>>> my_autodiff.add_node(my_mech_2)
>>> my_autodiff.add_projection(sender=my_mech_1, projection=my_projection, receiver=my_mech_2)
>>> # Specify inputs and targets
>>> my_inputs = {my_mech_1: [[1, 2, 3]]}
>>> my_targets = {my_mech_2: [[4, 5]]}
>>> input_dict = {"inputs": my_inputs, "targets": my_targets, "epochs": 2}
>>> # Run Composition in learning mode
>>> my_autodiff.learn(inputs = input_dict)
>>> # Run Composition in test mode
>>> my_autodiff.run(inputs = input_dict['inputs'])
The following shows how the AutodiffComposition created in the previous example can be nested and run inside another Composition:
>>> # Create outer composition
>>> my_outer_composition = pnl.Composition()
>>> my_outer_composition.add_node(my_autodiff)
>>> # Specify dict containing inputs and targets for nested Composition
>>> training_input = {my_autodiff: input_dict}
>>> # Run in learning mode
>>> result1 = my_outer_composition.learn(inputs=training_input)
Class Reference¶
- class psyneulink.library.compositions.autodiffcomposition.AutodiffComposition(pathways=None, optimizer_type='sgd', loss_spec=Loss.MSE, learning_rate=None, weight_decay=0, disable_learning=False, force_no_retain_graph=False, refresh_losses=False, synch_projection_matrices_with_torch='run', synch_node_variables_with_torch=None, synch_node_values_with_torch='run', synch_results_with_torch='run', retain_torch_trained_outputs='minibatch', retain_torch_targets='minibatch', retain_torch_losses='minibatch', device=None, disable_cuda=True, cuda_index=None, name='autodiff_composition', **kwargs)¶
- AutodiffComposition(optimizer_type='sgd', loss_spec=Loss.MSE, weight_decay=0, learning_rate=0.001, disable_learning=False, synch_projection_matrices_with_torch=RUN, synch_node_variables_with_torch=None, synch_node_values_with_torch=RUN, synch_results_with_torch=RUN, retain_torch_trained_outputs=MINIBATCH, retain_torch_targets=MINIBATCH, retain_torch_losses=MINIBATCH, device=CPU)
Subclass of Composition that trains models using either LLVM compilation or PyTorch; see Composition for additional constructor arguments and attributes.
- Parameters
optimizer_type (str : default 'sgd') – the kind of optimizer used in training. The current options are 'sgd' or 'adam'.
loss_spec (Loss or PyTorch loss function : default Loss.MSE) – specifies the loss function for training; see Loss for arguments.
weight_decay (float : default 0) – specifies the L2 penalty (which discourages large weights) used by the optimizer.
learning_rate (float : default 0.001) – specifies the learning rate passed to the optimizer if none is specified in the learn method of the AutodiffComposition; see learning_rate for additional details.
disable_learning (bool : default False) – specifies whether the AutodiffComposition should disable learning when run in learning mode.
synch_projection_matrices_with_torch (LearningScale : default RUN) – specifies the default for the AutodiffComposition for when to copy PyTorch parameters to PsyNeuLink Projection matrices (connection weights), which can be overridden by specifying the synch_projection_matrices_with_torch argument in the learn method; see synch_projection_matrices_with_torch for additional details.
synch_node_variables_with_torch (LearningScale : default None) – specifies the default for the AutodiffComposition for when to copy the current input to PyTorch nodes to the PsyNeuLink variable attribute of the corresponding PsyNeuLink nodes, which can be overridden by specifying the synch_node_variables_with_torch argument in the learn method; see synch_node_variables_with_torch for additional details.
synch_node_values_with_torch (LearningScale : default RUN) – specifies the default for the AutodiffComposition for when to copy the current output of PyTorch nodes to the PsyNeuLink value attribute of the corresponding PsyNeuLink nodes, which can be overridden by specifying the synch_node_values_with_torch argument in the learn method; see synch_node_values_with_torch for additional details.
synch_results_with_torch (LearningScale : default RUN) – specifies the default for the AutodiffComposition for when to copy the outputs of the PyTorch model to the AutodiffComposition’s results attribute, which can be overridden by specifying the synch_results_with_torch argument in the learn method. Note that this differs from retain_torch_trained_outputs, which specifies the frequency at which the outputs of the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_trained_outputs attribute at the end of the run; see synch_results_with_torch for additional details.
retain_torch_trained_outputs (LearningScale : default MINIBATCH) – specifies the default for the AutodiffComposition for the scale at which the outputs of the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_trained_outputs attribute at the end of the run; this can be overridden by specifying the retain_torch_trained_outputs argument in the learn method. Note that this differs from synch_results_with_torch, which specifies the frequency with which values are copied to the AutodiffComposition’s results attribute; see retain_torch_trained_outputs for additional details.
retain_torch_targets (LearningScale : default MINIBATCH) – specifies the default for the AutodiffComposition for when to copy the targets used for training the PyTorch model to the AutodiffComposition’s torch_targets attribute, which can be overridden by specifying the retain_torch_targets argument in the learn method; see retain_torch_targets for additional details.
retain_torch_losses (LearningScale : default MINIBATCH) – specifies the default for the AutodiffComposition for the scale at which the losses of the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_losses attribute at the end of the run; see retain_torch_losses for additional details.
device (torch.device : default device-dependent) – specifies the device on which the model is run. If None, the device is set to 'cuda' if available, then 'mps', otherwise 'cpu'.
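The fallback order described for device can be sketched as a small pure-Python function. This is an illustration of the stated order only, not the library's code: `choose_device` is a hypothetical name, and the availability flags are passed in as plain booleans (in practice they would likely come from checks such as `torch.cuda.is_available()` and `torch.backends.mps.is_available()`). The sketch deliberately ignores the disable_cuda and cuda_index constructor arguments, whose interaction with device is not described here.

```python
def choose_device(requested=None, cuda_available=False, mps_available=False):
    """Return a device name: the explicit request if given, else 'cuda', 'mps', 'cpu'."""
    if requested is not None:
        # An explicitly specified device always wins.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    # Nothing else is available, so fall back to the CPU.
    return "cpu"


on_gpu_machine = choose_device(cuda_available=True, mps_available=True)
on_apple_silicon = choose_device(mps_available=True)
on_plain_machine = choose_device()
forced_cpu = choose_device(requested="cpu", cuda_available=True)
```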
- pytorch_representation = None
- optimizer¶
the optimizer used for training. Depends on the optimizer_type, learning_rate, and weight_decay arguments from initialization.
- Type
PyTorch optimizer function
- loss¶
the loss function used for training. Depends on the loss_spec argument from initialization.
- Type
PyTorch loss function
- learning_rate¶
determines the learning_rate passed to the optimizer, and is applied to all Projections in the AutodiffComposition that are learnable.
Note
At present, the same learning rate is applied to all Components of an AutodiffComposition, irrespective of the learning_rate that may be specified for any individual Mechanisms or any nested Compositions; in the case of the latter, the learning_rate of the outermost AutodiffComposition is used, whether this is specified in the call to its learn method, in its constructor, or by its default value.
Hint
To disable updating of a particular MappingProjection in an AutodiffComposition, specify the learnable parameter of its constructor as False; this applies to MappingProjections at any level of nesting.
- Type
float
- synch_projection_matrices_with_torch¶
determines when to copy PyTorch parameters to PsyNeuLink Projection matrices (connection weights) if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with parameter updates in PyTorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, MINIBATCH, EPOCH or RUN
- synch_node_variables_with_torch¶
determines when to copy the current input to PyTorch nodes (modules) to the PsyNeuLink variable attribute of the corresponding PsyNeuLink nodes, if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with parameter updates in PyTorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH, RUN or None
- synch_node_values_with_torch¶
determines when to copy the current output of PyTorch nodes (modules) to the PsyNeuLink value attribute of the corresponding PsyNeuLink nodes, if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with parameter updates in PyTorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, MINIBATCH, EPOCH or RUN
- synch_results_with_torch¶
determines when to copy the current outputs of PyTorch nodes to the PsyNeuLink results attribute of the AutodiffComposition if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with parameter updates in PyTorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH or RUN
- retain_torch_trained_outputs¶
determines the scale at which the outputs of the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_trained_outputs attribute at the end of the run, if this is not specified in the call to learn (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, MINIBATCH, EPOCH, RUN or None
- retain_torch_targets¶
determines the scale at which the targets used for training the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_targets attribute at the end of the run, if this is not specified in the call to learn (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH, RUN or None
- retain_torch_losses¶
determines the scale at which the losses of the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_losses attribute at the end of the run, if this is not specified in the call to learn (see AutodiffComposition_PyTorch_LearningScale for information about settings).
- Type
OPTIMIZATION_STEP, MINIBATCH, EPOCH, RUN or None
- torch_trained_outputs¶
stores the outputs (converted to np arrays) of the PyTorch model trained during learning, at the frequency specified by retain_torch_trained_outputs if it is set to MINIBATCH, EPOCH, or RUN; see retain_torch_trained_outputs for additional details.
- Type
List[ndarray]
- torch_targets¶
stores the targets used for training the PyTorch model during learning, at the frequency specified by retain_torch_targets if it is set to MINIBATCH, EPOCH, or RUN; see retain_torch_targets for additional details.
- Type
List[ndarray]
- torch_losses¶
stores the average loss after each weight update (i.e., each minibatch) during learning, at the frequency specified by retain_torch_losses if it is set to MINIBATCH, EPOCH, or RUN; see retain_torch_losses for additional details.
- Type
list of floats
- last_saved_weights¶
path for file to which weights were last saved.
- Type
path
- last_loaded_weights¶
path for file from which weights were last loaded.
- Type
path
- device¶
the device on which the model is run.
- Type
torch.device
- class PytorchCompositionWrapper(composition, device, outer_creator=None, context=None)¶
Wrapper that exposes a Composition as a PyTorch module.
Two main responsibilities:
- Set up the parameters of the PyTorch model and the information required for forward computation.
Handle nested Compositions (flattened in infer_backpropagation_learning_pathways): deal with Projections into and/or out of a nested Composition as shown in the figure below (note: Projections in the outer Composition to/from a nested Composition’s CIMs are learnable, and ones in a nested Composition from/to its CIMs are not):
[     OUTER      ][                             NESTED                              ][     OUTER      ]
          learnable      not learnable                         not learnable        learnable
---> [Node] ----> [input_CIM] ~~~> [INPUT Node] ----> [OUTPUT Node] ~~~> [output_CIM] ----> [Node] --->
     sndr         rcvr             nested_rcvr        nested_sndr        sndr               rcvr
     ^--projection-->^                                                   ^---projection--->^
     ^----PytorchProjectionWrapper---->^                    ^----PytorchProjectionWrapper---->^
                    ENTRY                                                  EXIT
- Handle coordination of passing data and outcomes back to PsyNeuLink objects, handled by two main methods:
- synch_with_psyneulink()
Copies matrix weights, node variables, node values, and/or autodiff results at user-specified intervals (LearningScale: OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH, RUN); these are specified by the user in the following arguments to run() or learn():
synch_projection_matrices_with_torch=RUN, synch_node_variables_with_torch=None, synch_node_values_with_torch=RUN, synch_results_with_torch=RUN,
and consolidated in the synch_with_pnl_options dict used by synch_with_psyneulink.
- retain_for_psyneulink()
Retains learning-specific data used and outcomes generated during execution of the PyTorch model (TRAINED_OUTPUT_VALUES, and the corresponding TARGETS and LOSSES), which are copied to PsyNeuLink at the end of a call to learn(); these are specified by the user in the following arguments to learn():
retain_torch_trained_outputs=MINIBATCH, retain_torch_targets=MINIBATCH, retain_torch_losses=MINIBATCH,
and consolidated in the retain_in_pnl_options dict used by retain_for_psyneulink.
- Note: RESULTS is handled in an idiosyncratic way: it is specified along with the synchronization parameters, since it is a value ordinarily generated in the execution of a Composition; however, its helper parallels the retain_for_psyneulink helper methods, and it is called from _update_results if TRIAL is specified, in order to integrate with the standard execution of a Composition.
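The cost trade-off between LearningScale settings can be illustrated with a rough count of how often a copy to PsyNeuLink would occur at each scale. The sketch below assumes, as a simplification not stated in this page, that a copy happens exactly once at the end of each unit of the chosen scale, and that a run is organized as epochs containing minibatches containing optimization steps. The function name and its arguments are hypothetical, and TRIAL is omitted because its relation to the other scales depends on the minibatch size.

```python
def count_copies(num_epochs, minibatches_per_epoch, optimizations_per_minibatch):
    """Count copy events per scale under the assumed nesting of a learning run."""
    counts = {"OPTIMIZATION_STEP": 0, "MINIBATCH": 0, "EPOCH": 0, "RUN": 0}
    for _epoch in range(num_epochs):
        for _minibatch in range(minibatches_per_epoch):
            for _step in range(optimizations_per_minibatch):
                # One optimization step: a single forward()/backward() pair.
                counts["OPTIMIZATION_STEP"] += 1
            counts["MINIBATCH"] += 1
        counts["EPOCH"] += 1
    # The whole call to learn() counts as one run.
    counts["RUN"] += 1
    return counts


copies = count_copies(num_epochs=2, minibatches_per_epoch=3,
                      optimizations_per_minibatch=4)
```

Under these assumptions, synchronizing at OPTIMIZATION_STEP copies 24 times where RUN copies once, which is the kind of overhead the default of RUN avoids.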
- _composition¶
AutodiffComposition being wrapped.
- Type
- wrapped_nodes¶
list of nodes in the PytorchCompositionWrapper corresponding to PyTorch modules. Generally these are Mechanisms wrapped in a PytorchMechanismWrapper; however, if the AutodiffComposition being wrapped is itself a nested Composition, then the wrapped nodes are PytorchCompositionWrapper objects. When the PyTorch model is executed these are “flattened” into a single PyTorch module, which can be visualized using the AutodiffComposition’s show_graph method and setting its show_pytorch argument to True (see PytorchShowGraph for additional information).
- Type
List[PytorchMechanismWrapper]
- nodes_map¶
maps psyneulink Nodes to PytorchCompositionWrapper nodes.
- Type
Dict[Node: PytorchMechanismWrapper or PytorchCompositionWrapper]
- projection_wrappers = List[PytorchProjectionWrapper]
list of PytorchProjectionWrappers in the PytorchCompositionWrapper, each of which wraps a Projection in the AutodiffComposition being wrapped.
- projections_map¶
maps Projections in the AutodiffComposition being wrapped to PytorchProjectionWrappers in the PytorchCompositionWrapper.
- Type
Dict[Projection: PytorchProjectionWrapper]
- _nodes_to_execute_after_gradient_calc¶
contains nodes specified as exclude_from_gradient_calc as keys, and their current variable as values.
- Type
Dict[node : torch.Tensor]
- optimizer¶
assigned by AutodiffComposition after the wrapper is created, which passes the parameters to the optimizer.
- Type
torch
- device¶
device used to process torch Tensors in PyTorch modules
- Type
torch.device
- params¶
list of PyTorch parameters (connection weight matrices) in the PyTorch model.
- Type
nn.ParameterList()
- minibatch_loss¶
accumulated loss over all trials (stimuli) within a batch.
- Type
torch.Tensor
- minibatch_loss_count¶
count of losses (trials) within batch, used to calculate average loss per batch.
- Type
int
- retained_results¶
list of the output_values of the AutodiffComposition for every trial executed in a call to run or learn.
- Type
List[ndarray]
- retained_trained_outputs¶
values of the trained OUTPUT Nodes (i.e., ones associated with a TARGET Node) for each trial executed in a call to learn.
- Type
List[ndarray]
- retained_targets¶
values of the TARGET Nodes for each trial executed in a call to learn.
- Type
List[ndarray]
- retained_losses¶
losses per batch, epoch or run accumulated over a call to learn()
- Type
List[ndarray]
- _regenerate_paramlist()¶
Add Projection matrices to Pytorch Module’s parameter list
- copy_node_values_to_psyneulink(nodes='all', context=None)¶
Copy output of Pytorch nodes to value of AutodiffComposition nodes. IMPLEMENTATION NOTE: list included in nodes arg to allow for future specification of specific nodes to copy
- copy_node_variables_to_psyneulink(nodes='all', context=None)¶
Copy input to Pytorch nodes to variable of AutodiffComposition nodes. IMPLEMENTATION NOTE: list included in nodes arg to allow for future specification of specific nodes to copy
- copy_results_to_psyneulink(current_condition, context=None)¶
Copy outputs of Pytorch forward() to AutodiffComposition.results attribute.
- execute_node(node, variable, optimization_num, context=None)¶
Execute node and store the result in the node’s value attribute. Implemented as a method (and includes optimization_rep and context as args) so that it can be overridden by subclasses of PytorchCompositionWrapper.
- forward(inputs, optimization_rep, context=None)¶
Forward method of the model for PyTorch and LLVM modes. Returns a dictionary {output_node:value} of output values for the model.
- Return type
dict
- retain_for_psyneulink(data, retain_in_pnl_options, context)¶
Store outputs, targets, and losses from PyTorch execution for copying to PsyNeuLink at end of learn().
- Parameters
data (dict) – specifies local data available to retain (for copying to PsyNeuLink at end of run); keys must be one or more of the keywords OUTPUTS, TARGETS, or LOSSES; value must be a torch.Tensor.
retain_in_pnl_options (dict) – specifies which data the user has requested be retained (and copied to PsyNeuLink at end of run); keys must be OUTPUTS, TARGETS, or LOSSES; value must be a LearningScale.name or None (which suppresses copy).
Note: this does not actually copy data to PsyNeuLink; that is done by the _getter methods for the relevant autodiff Parameters.
- retain_losses(loss)¶
Track losses and copy to AutodiffComposition.torch_losses at end of learn().
- retain_results(results)¶
Track outputs and copy to AutodiffComposition.pytorch_outputs at end of learn().
- retain_targets(targets)¶
Track targets and copy to AutodiffComposition.pytorch_targets at end of learn().
- retain_trained_outputs(trained_outputs)¶
Track outputs and copy to AutodiffComposition.pytorch_outputs at end of learn().
- synch_with_psyneulink(synch_with_pnl_options, current_condition, context, params=None)¶
Copy weights, values, and/or results from PyTorch to PsyNeuLink at specified junctures. params can be used to restrict the copy to a specific (set of) param(s); if params is not specified, all are copied.
- pytorch_composition_wrapper_type¶
alias of
psyneulink.library.compositions.pytorchwrappers.PytorchCompositionWrapper
- class Parameters(owner, parent=None)¶
- assign_ShowGraph(show_graph_attributes)¶
Override to replace assignment of ShowGraph class with PytorchShowGraph if torch is available
- infer_backpropagation_learning_pathways(execution_mode, context=None)¶
Create backpropagation learning pathways for every Input Node –> Output Node pathway. Flattens nested Compositions:
only includes the Projections in the outer Composition to/from the CIMs of the nested Composition (i.e., to input_CIMs and from output_CIMs), which are the ones that should be learned;
excludes Projections from/to CIMs in the nested Composition (from input_CIMs and to output_CIMs), as those should remain identity Projections;
see PytorchCompositionWrapper for a table of how Projections are handled and further details.
Returns list of target nodes for each pathway.
- Return type
list
- _build_pytorch_representation(context=None, refresh=False)¶
Builds a Pytorch representation of the AutodiffComposition
- autodiff_forward(inputs, targets, synch_with_pnl_options, retain_in_pnl_options, execution_mode, scheduler, context)¶
Perform forward pass of model and compute loss for a single trial (i.e., a single input) in PyTorch mode. Losses are accumulated in pytorch_rep.track_losses over calls to this method within a minibatch; at the end of a minibatch, they are averaged and backpropagated by compositionrunner.run_learning() before the next time it calls run(), in a call to backward() by do_gradient_optimization() in _batch_inputs() or _batch_function_inputs().
- do_gradient_optimization(retain_in_pnl_options, context, optimization_num=None)¶
Compute loss and use it in a call to autodiff_backward() to compute gradients and update PyTorch parameters. Updates parameters (weights) based on the trial(s) executed since the last optimization, and reinitializes minibatch_loss and minibatch_loss_count.
- autodiff_backward(minibatch_loss, context)¶
Calculate gradients and apply to PyTorch model parameters (weights)
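The bookkeeping described for autodiff_forward() and do_gradient_optimization() (accumulate one loss per trial, average over the minibatch, then reset) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `MinibatchLossTracker`, `add_trial_loss`, `average_and_reset`, and `mse` are hypothetical names, plain floats stand in for torch tensors, and the gradient computation itself is omitted.

```python
class MinibatchLossTracker:
    """Hypothetical stand-in for the minibatch_loss / minibatch_loss_count pair."""

    def __init__(self):
        self.minibatch_loss = 0.0
        self.minibatch_loss_count = 0

    def add_trial_loss(self, loss):
        # Called once per trial, mirroring the accumulation done per forward pass.
        self.minibatch_loss += loss
        self.minibatch_loss_count += 1

    def average_and_reset(self):
        # Called at the end of a minibatch: average, then reinitialize both fields.
        if self.minibatch_loss_count == 0:
            raise ValueError("no trials recorded since the last optimization step")
        average = self.minibatch_loss / self.minibatch_loss_count
        self.minibatch_loss = 0.0
        self.minibatch_loss_count = 0
        return average


def mse(output, target):
    """Mean squared error between two equal-length sequences."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


tracker = MinibatchLossTracker()
# Two (output, target) pairs forming one minibatch; values are illustrative.
trials = [([1.0, 2.0], [1.0, 4.0]), ([0.0, 0.0], [2.0, 2.0])]
for output, target in trials:
    tracker.add_trial_loss(mse(output, target))

average_loss = tracker.average_and_reset()  # this value would be backpropagated
```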
- _get_autodiff_inputs_values(input_dict)¶
Remove TARGET Nodes, and return dict with values of INPUT Nodes for a single trial. For nested Compositions, replace the input to the nested Composition with inputs to its INPUT Nodes. For InputPorts, replace with owner.
- Returns
- Return type
A dict mapping INPUT Nodes -> input values for a single trial
- _get_autodiff_targets_values(input_dict)¶
Return dict with values for TARGET Nodes. Gets inputs to TARGET Nodes used for computation of loss in autodiff_forward(). Uses input_dict to get values for TARGET Nodes that are INPUT Nodes of the AutodiffComposition. If a TARGET Node is not an INPUT Node, it is assumed to be the target of a projection from an INPUT Node, and the value is determined by searching recursively for the INPUT Node that projects to the TARGET Node.
- Returns
- Return type
A dict mapping TARGET Nodes -> target values
- _parse_learning_spec(inputs, targets, execution_mode, context)¶
Converts learning inputs and targets to a standardized form
- Returns
dict – Dict mapping mechanisms to values (with TargetMechanisms inferred from output nodes if needed)
int – Number of input sets in dict for each input node in the Composition
- _identify_target_nodes(context)¶
Recursively call all nested AutodiffCompositions to assign TARGET nodes for learning
- learn(*args, synch_projection_matrices_with_torch=NotImplemented, synch_node_variables_with_torch=NotImplemented, synch_node_values_with_torch=NotImplemented, synch_results_with_torch=NotImplemented, retain_torch_trained_outputs=NotImplemented, retain_torch_targets=NotImplemented, retain_torch_losses=NotImplemented, **kwargs)¶
Override to handle synch and retain args. Note: defaults for synch and retain args are set to NotImplemented, so that the user can specify None if
they want to locally override the default values for the AutodiffComposition (see docstrings for run() and _parse_synch_and_retain_args() for additional details).
- Return type
list
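Using NotImplemented as the default makes it possible to tell "argument not passed" apart from "argument explicitly set to None". A minimal sketch of that sentinel pattern follows; `resolve_option` is a hypothetical helper and is not part of PsyNeuLink (the library's own handling lives in _parse_synch_and_retain_args()).

```python
def resolve_option(local_value, composition_default):
    """Hypothetical helper: pick the value that applies to a single call."""
    if local_value is NotImplemented:
        # Argument was not passed: fall back to the Composition-level default
        return composition_default
    # Argument was passed explicitly; None is a legitimate local override
    return local_value


default_synch = "run"
not_passed = resolve_option(NotImplemented, default_synch)
overridden_with_none = resolve_option(None, default_synch)
overridden_with_value = resolve_option("minibatch", default_synch)
```

Had None been used as the default instead, a caller could not turn an option off for a single call without changing the Composition's own default.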
- _get_execution_mode(execution_mode)¶
Parse the execution_mode argument and return a valid execution mode for the learn() method. Can be overridden by subclasses to change the permitted and/or default execution mode for learning.
- execute(inputs=None, num_trials=None, minibatch_size=1, optimizations_per_minibatch=1, do_logging=False, scheduler=None, termination_processing=None, call_before_minibatch=None, call_after_minibatch=None, call_before_time_step=None, call_before_pass=None, call_after_time_step=None, call_after_pass=None, reset_stateful_functions_to=None, context=None, base_context=<psyneulink.core.globals.context.Context object>, clamp_input='soft_clamp', targets=None, runtime_params=None, execution_mode=ExecutionMode.PyTorch, skip_initialization=False, synch_with_pnl_options=None, retain_in_pnl_options=None, report_output=ReportOutput.OFF, report_params=ReportParams.OFF, report_progress=ReportProgress.OFF, report_simulations=ReportSimulations.OFF, report_to_devices=None, report=None, report_num=None)¶
Override to execute autodiff_forward() in learning mode if execution_mode is not Python
- Return type
ndarray
- run(*args, synch_projection_matrices_with_torch=NotImplemented, synch_node_variables_with_torch=NotImplemented, synch_node_values_with_torch=NotImplemented, synch_results_with_torch=NotImplemented, retain_torch_trained_outputs=NotImplemented, retain_torch_targets=NotImplemented, retain_torch_losses=NotImplemented, **kwargs)¶
Override to handle synch and retain args if run() is called directly rather than from learn(). Note: defaults for synch and retain args are NotImplemented, so that the user can specify None if they want
to locally override the default values for the AutodiffComposition (see _parse_synch_and_retain_args() for details). This is distinct from the user assigning the Parameter default_value(s), which is done in the AutodiffComposition constructor and handled by the Parameter._specify_none attribute.
- _update_results(results, trial_output, execution_mode, synch_with_pnl_options, context)¶
Update results by appending the most recent trial_output. This is included as a helper so it can be overridden by subclasses (such as AutodiffComposition) that may need to do this less frequently for scalable execution.
- save(path=None, directory=None, filename=None, context=None)¶
Saves all weight matrices for all MappingProjections in the AutodiffComposition
- Parameters
path (Path, PosixPath or str : default None) – path specification; must be a legal path specification in the filesystem.
directory (str : default current working directory) – directory where matrices for all MappingProjections in the AutodiffComposition are saved.
filename (str : default <name of AutodiffComposition>_matrix_wts.pnl) – filename in which matrices for all MappingProjections in the AutodiffComposition are saved.
Note: Matrices are saved in PyTorch state_dict format.
- Returns
- Return type
Path
- load(path=None, directory=None, filename=None, context=None)¶
Loads all weight matrices for all MappingProjections in the AutodiffComposition from file
- Parameters
path (PosixPath) – path for the file in which MappingProjection matrices are stored. This must be a legal PosixPath object; if it is specified, directory and filename are ignored.
directory (str : default current working directory) – directory where MappingProjection matrices are stored.
filename (str : default <name of AutodiffComposition>_matrix_wts.pnl) – name of file in which MappingProjection matrices are stored.
Note: Matrices must be stored in PyTorch state_dict format.
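The way path, directory, and filename combine can be sketched with pathlib. This is a sketch under stated assumptions, not the library's code: `resolve_weights_path` is a hypothetical helper, the composition name "xor_model" is made up, and the rule that an explicit path takes precedence is taken from the description of load() and assumed to apply to save() as well.

```python
from pathlib import Path


def resolve_weights_path(composition_name, path=None, directory=None, filename=None):
    """Hypothetical helper mirroring the documented defaults for save()/load()."""
    if path is not None:
        # An explicit path wins; directory and filename are ignored
        return Path(path)
    # Default directory: current working directory
    directory = Path(directory) if directory is not None else Path.cwd()
    # Default filename: <name of AutodiffComposition>_matrix_wts.pnl
    filename = filename or f"{composition_name}_matrix_wts.pnl"
    return directory / filename


explicit = resolve_weights_path("xor_model", path="weights/run1.pnl", directory="ignored")
defaulted = resolve_weights_path("xor_model", directory="checkpoints")
```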
- parameters = <psyneulink.library.compositions.autodiffcomposition.AutodiffComposition.Parameters object> : ( device = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='device' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), execute_until_finished = Parameter( default_value=True delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='execute_until_finished' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), execution_count = Parameter( default_value=array(0) delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='execution_count' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=False structural=False user=True values={} ), has_initializers = Parameter( default_value=False delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='has_initializers' parse_spec=False pnl_internal=True port=None reference=False setter=<function _has_initializers_setter> specify_none=False stateful=True structural=False user=True values={} ), input_specification = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='input_specification' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=False structural=False 
user=True values={} ), is_finished_flag = Parameter( default_value=True delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='is_finished_flag' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), learning_rate = Parameter( default_value=array(0.001) delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='learning_rate' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), learning_results = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='learning_results' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), max_executions_before_finished = Parameter( default_value=array(1000) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='max_executions_before_finished' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), minibatch_size = Parameter( default_value=array(1) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None modulable=True modulation_combination_function=None name='minibatch_size' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), num_executions = Parameter( 
default_value=Time(run: 0, trial: 0, pass: 0, time_step: 0) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='num_executions' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), num_executions_before_finished = Parameter( default_value=array(0) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='num_executions_before_finished' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), optimizations_per_minibatch = Parameter( default_value=array(1) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None modulable=True modulation_combination_function=None name='optimizations_per_minibatch' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), optimizer = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='optimizer' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), pytorch_representation = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='pytorch_representation' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), results = Parameter( 
default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='results' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), retain_old_simulation_data = Parameter( default_value=False delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='retain_old_simulation_data' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=False structural=False user=True values={} ), retain_torch_losses = Parameter( default_value='minibatch' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='retain_torch_losses' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), retain_torch_targets = Parameter( default_value='minibatch' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='retain_torch_targets' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), retain_torch_trained_outputs = Parameter( default_value='minibatch' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='retain_torch_trained_outputs' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), simulation_results = Parameter( default_value=[] 
delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='simulation_results' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_node_values_with_torch = Parameter( default_value='run' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_node_values_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_node_variables_with_torch = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_node_variables_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_projection_matrices_with_torch = Parameter( default_value='run' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_projection_matrices_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_results_with_torch = Parameter( default_value='run' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_results_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), torch_losses = 
Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True getter=<function _get_torch_losses> history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='torch_losses' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), torch_targets = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True getter=<function _get_torch_targets> history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='torch_targets' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), torch_trained_outputs = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True getter=<function _get_torch_trained_outputs> history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='torch_trained_outputs' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), trial_losses = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='trial_losses' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), value = Parameter( default_value=NotImplemented delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='value' parse_spec=False pnl_internal=False port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), variable = Parameter( constructor_argument='default_variable' 
default_value=array([0]) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='variable' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), )¶
- show_graph(*args, **kwargs)¶
Override to use PytorchShowGraph if show_pytorch is True