AutodiffComposition¶

Contents¶

Overview

Creating an AutodiffComposition

Only one OutputPort per Node

No Modulatory Components

No Bias Parameters

Nesting

No Post-construction Modification

Execution

PyTorch mode

LLVM mode

Python mode

Nested Execution and Modulation

Logging

Examples

Class Reference

Overview¶

AutodiffComposition is a subclass of Composition for constructing and training feedforward neural networks, using either direct compilation (to LLVM) or automatic conversion to PyTorch, both of which considerably accelerate training (by as much as three orders of magnitude) compared to the standard implementation of learning in a Composition. Although an AutodiffComposition is constructed and executed in much the same way as a standard Composition, it is largely restricted to feedforward neural networks using supervised learning, and in particular the backpropagation learning algorithm, although it can be used for some forms of unsupervised learning that are supported in PyTorch (e.g., self-organized maps).

Creating an AutodiffComposition¶

An AutodiffComposition can be created by calling its constructor, and then adding Components using the standard Composition methods for doing so (e.g., add_node, add_projection, add_linear_processing_pathway, etc.). The constructor also includes a number of parameters that are specific to the AutodiffComposition (see Class Reference for a list of these parameters, and examples below). While an AutodiffComposition can generally be created using the same methods as a standard Composition, there are a few restrictions that apply to its construction, summarized below.

Only one OutputPort per Node¶

The Nodes of an AutodiffComposition currently can have only one OutputPort, though that can have more than one efferent MappingProjection. Nodes can also have more than one InputPort, each of which can receive more than one path_afferent Projection.

No Modulatory Components¶

All of the Components in an AutodiffComposition must be able to be subjected to learning, which means that no ModulatoryMechanisms can be included in an AutodiffComposition. Specifically, this precludes any learning components, ControlMechanisms, or a controller.

Learning Components. An AutodiffComposition cannot include any learning components themselves (i.e., LearningMechanisms, LearningSignals, or LearningProjections, nor the ComparatorMechanism or ObjectiveMechanism used to compute the loss for learning). These are constructed automatically when learning is executed in Python mode or LLVM mode, and PyTorch-compatible Components are constructed when it is executed in PyTorch mode.

Control Components. An AutodiffComposition also cannot include any ControlMechanisms or a controller. However, it can include Mechanisms that are subject to modulatory control (see Figure, and modulation) by ControlMechanisms outside the Composition, including the controller of a Composition within which the AutodiffComposition is nested. That is, an AutodiffComposition can be nested in a Composition that has other such Components (see Nested Execution and Modulation below).

No Bias Parameters¶

AutodiffComposition does not (currently) support the automatic construction of separate bias parameters. Thus, when constructing the PyTorch version of an AutodiffComposition, the bias parameter of any PyTorch module is set to False. However, biases can be implemented using BIAS Nodes.
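The equivalence between a bias parameter and a BIAS Node can be sketched in plain Python, independent of PsyNeuLink. The idea is that a node that always outputs 1.0 and sends a learnable projection to a layer contributes exactly what a bias vector would. All names below (matvec, bias_projection, and so on) are illustrative assumptions, not part of the PsyNeuLink API.

```python
def matvec(weights, x):
    """Multiply a row-vector input x by a weight matrix (rows index senders)."""
    n_out = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n_out)]

# A 2-input, 2-output linear layer with an explicit bias term
W = [[0.5, -1.0],
     [2.0, 0.25]]
b = [0.1, -0.3]
x = [1.0, 3.0]
with_bias = [y + bj for y, bj in zip(matvec(W, x), b)]

# The same layer without a bias parameter: a bias node that always outputs 1.0
# sends a learnable 1 x 2 projection (holding the values of b) to the layer
bias_node_output = [1.0]
bias_projection = [b]
from_input = matvec(W, x)
from_bias = matvec(bias_projection, bias_node_output)
with_bias_node = [u + v for u, v in zip(from_input, from_bias)]
```

Because the bias node's projection is an ordinary weight matrix, it would be trained like any other projection.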

Nesting¶

An AutodiffComposition can be nested inside another Composition for learning, and there can be any level of such nestings. However, all of the nested Compositions must be AutodiffCompositions. Furthermore, all nested Compositions use the learning_rate specified for the outermost Composition, whether this is specified in the call to its learn method, its constructor, or its default value is being used (see learning_rate below for additional details).

Projections from Nodes in an immediately enclosing outer Composition to the input_CIM of a nested Composition, and from its output_CIM to Nodes in the outer Composition are subject to learning; however those within the nested Composition itself (i.e., from its input_CIM to its INPUT Nodes and from its OUTPUT Nodes to its output_CIM) are not subject to learning, as they serve simply as conduits of information between the outer Composition and the nested one.

Warning

Nested Compositions are supported for learning only in PyTorch mode, and will cause an error if the learn method of an AutodiffComposition is executed in Python mode or LLVM mode.

No Post-construction Modification¶

Mechanisms or Projections should not be added to or deleted from an AutodiffComposition after it has been executed. Unlike an ordinary Composition, AutodiffComposition does not support this functionality.

Execution¶

An AutodiffComposition’s run, execute, and learn methods are the same as for a Composition. However, the execution_mode argument of the learn method has different effects than for a standard Composition: it determines whether LLVM compilation or translation to PyTorch is used to execute learning. These are each described in greater detail below, and summarized in this table which provides a comparison of the different modes of execution for an AutodiffComposition and standard Composition.

PyTorch mode¶

This is the default for an AutodiffComposition, but can be specified explicitly by setting execution_mode = ExecutionMode.PyTorch in the learn method (see example in Basics and Primer). In this mode, the AutodiffComposition is automatically translated to a PyTorch model for learning. This is comparable in speed to LLVM compilation, but provides greater flexibility, including the ability to include nested AutodiffCompositions in learning. Although it is best suited for use with supervised learning, it can also be used for some forms of unsupervised learning that are supported in PyTorch (e.g., self-organized maps).

For the LearningScale settings used by the synch and retain parameters listed in the Class Reference, an OPTIMIZATION_STEP in an AutodiffComposition corresponds to a single call to the forward() and backward() methods of the PyTorch model.

Note

While specifying ExecutionMode.PyTorch in the learn method of an AutodiffComposition causes it to use PyTorch for training, specifying this in the run method causes it to be executed using the Python interpreter (and not PyTorch); this is so that any modulation can take effect during execution (see Nested Execution and Modulation below), which is not supported by PyTorch.

Warning

Specifying ExecutionMode.LLVMRun or ExecutionMode.PyTorch in the learn() method of a standard Composition causes an error.

LLVM mode¶

This is specified by setting execution_mode = ExecutionMode.LLVMRun in the learn method of an AutodiffComposition. This provides the fastest performance, but is limited to supervised learning using the BackPropagation algorithm. This can be run using standard forms of loss, including mean squared error (MSE) and cross entropy, by specifying this in the loss_spec argument of the constructor (see AutodiffComposition for additional details, and Compilation Modes for more information about executing a Composition in compiled mode).

Note

Specifying ExecutionMode.LLVMRun in either the learn or the run method of an AutodiffComposition causes it to (attempt to) use compiled execution in both cases; this is because LLVM compilation supports the use of modulation in PsyNeuLink models (as compared to PyTorch mode; see the note under PyTorch mode above).

Python mode¶

An AutodiffComposition can also be run using the standard PsyNeuLink learning components. However, this cannot be used if the AutodiffComposition has any nested Compositions, irrespective of whether they are ordinary Compositions or AutodiffCompositions.

Nested Execution and Modulation¶

Like any other Composition, an AutodiffComposition may be nested inside another (see example below). However, during learning, none of the internal Components of the AutodiffComposition (e.g., intermediate layers of a neural network model) are accessible to the other Components of the outer Composition (e.g., as sources of information, or for modulation). In contrast, when it is executed using its run method, the AutodiffComposition functions like any other, and all of its internal Components are accessible to other Components of the outer Composition. Thus, as long as access to its internal Components is not needed during learning, an AutodiffComposition can be trained, and then used to execute the trained Composition like any other.

Logging¶

Logging in AutodiffCompositions follows the same procedure as logging in a Composition. However, since an AutodiffComposition internally converts all of its Mechanisms either to LLVM or to an equivalent PyTorch model, its inner components are not actually executed. This means that there is limited support for logging parameters of components inside an AutodiffComposition; currently, the only supported parameters are:

the matrix parameter of Projections
the value parameter of its inner components

Examples

The following is an example showing how to create a simple AutodiffComposition, specify its inputs and targets, and run it with learning enabled and disabled:

>>> import numpy as np
>>> import psyneulink as pnl
>>> # Set up PsyNeuLink Components
>>> my_mech_1 = pnl.TransferMechanism(function=pnl.Linear, input_shapes = 3)
>>> my_mech_2 = pnl.TransferMechanism(function=pnl.Linear, input_shapes = 2)
>>> my_projection = pnl.MappingProjection(matrix=np.random.randn(3,2),
...                     sender=my_mech_1,
...                     receiver=my_mech_2)
>>> # Create AutodiffComposition
>>> my_autodiff = pnl.AutodiffComposition()
>>> my_autodiff.add_node(my_mech_1)
>>> my_autodiff.add_node(my_mech_2)
>>> my_autodiff.add_projection(sender=my_mech_1, projection=my_projection, receiver=my_mech_2)
>>> # Specify inputs and targets
>>> my_inputs = {my_mech_1: [[1, 2, 3]]}
>>> my_targets = {my_mech_2: [[4, 5]]}
>>> input_dict = {"inputs": my_inputs, "targets": my_targets, "epochs": 2}
>>> # Run Composition in learning mode
>>> my_autodiff.learn(inputs = input_dict)
>>> # Run Composition in test mode
>>> my_autodiff.run(inputs = input_dict['inputs'])

The following shows how the AutodiffComposition created in the previous example can be nested and run inside another Composition:

>>> # Create outer composition
>>> my_outer_composition = pnl.Composition()
>>> my_outer_composition.add_node(my_autodiff)
>>> # Specify dict containing inputs and targets for nested Composition
>>> training_input = {my_autodiff: input_dict}
>>> # Run in learning mode
>>> result1 = my_outer_composition.learn(inputs=training_input)

Class Reference¶

class psyneulink.library.compositions.autodiffcomposition.AutodiffComposition(pathways=None, optimizer_type='sgd', loss_spec=Loss.MSE, learning_rate=None, weight_decay=0, disable_learning=False, force_no_retain_graph=False, refresh_losses=False, synch_projection_matrices_with_torch='run', synch_node_variables_with_torch=None, synch_node_values_with_torch='run', synch_results_with_torch='run', retain_torch_trained_outputs='minibatch', retain_torch_targets='minibatch', retain_torch_losses='minibatch', device=None, disable_cuda=True, cuda_index=None, name='autodiff_composition', **kwargs)¶

AutodiffComposition(optimizer_type='sgd', loss_spec=Loss.MSE, weight_decay=0, learning_rate=0.001, disable_learning=False, synch_projection_matrices_with_torch=RUN, synch_node_variables_with_torch=None, synch_node_values_with_torch=RUN, synch_results_with_torch=RUN, retain_torch_trained_outputs=MINIBATCH, retain_torch_targets=MINIBATCH, retain_torch_losses=MINIBATCH, device=CPU)

Subclass of Composition that trains models using either LLVM compilation or PyTorch; see Composition for additional arguments to the constructor and additional attributes.

Parameters

optimizer_type (str : default 'sgd') – the kind of optimizer used in training. The current options are ‘sgd’ or ‘adam’.
loss_spec (Loss or PyTorch loss function : default Loss.MSE) – specifies the loss function for training; see Loss for arguments.
weight_decay (float : default 0) – specifies the L2 penalty (which discourages large weights) used by the optimizer.
learning_rate (float : default 0.001) – specifies the learning rate passed to the optimizer if none is specified in the learn method of the AutodiffComposition; see learning_rate for additional details.
disable_learning (bool: default False) – specifies whether the AutodiffComposition should disable learning when run in learning mode.
synch_projection_matrices_with_torch (LearningScale : default RUN) – specifies the default for the AutodiffComposition for when to copy Pytorch parameters to PsyNeuLink Projection matrices (connection weights), which can be overridden by specifying the synch_projection_matrices_with_torch argument in the learn method; see synch_projection_matrices_with_torch for additional details.
synch_node_variables_with_torch (LearningScale : default None) – specifies the default for the AutodiffComposition for when to copy the current input to Pytorch nodes to the PsyNeuLink variable attribute of the corresponding PsyNeuLink nodes, which can be overridden by specifying the synch_node_variables_with_torch argument in the learn method; see synch_node_variables_with_torch for additional details.
synch_node_values_with_torch (LearningScale : default RUN) – specifies the default for the AutodiffComposition for when to copy the current output of Pytorch nodes to the PsyNeuLink value attribute of the corresponding PsyNeuLink nodes, which can be overridden by specifying the synch_node_values_with_torch argument in the learn method; see synch_node_values_with_torch for additional details.
synch_results_with_torch (LearningScale : default RUN) – specifies the default for the AutodiffComposition for when to copy the outputs of the Pytorch model to the AutodiffComposition’s results attribute, which can be overridden by specifying the synch_results_with_torch argument in the learn method. Note that this differs from retain_torch_trained_outputs, which specifies the frequency at which the outputs of the PyTorch model are tracked, all of which are stored in the AutodiffComposition’s torch_trained_outputs attribute at the end of the run; see synch_results_with_torch for additional details.
retain_torch_trained_outputs (LearningScale : default MINIBATCH) – specifies the default for the AutodiffComposition for the scale at which the outputs of the Pytorch model are tracked, all of which are stored in the AutodiffComposition’s torch_trained_outputs attribute at the end of the run; this can be overridden by specifying the retain_torch_trained_outputs argument in the learn method. Note that this differs from synch_results_with_torch, which specifies the frequency with which values are copied to the AutodiffComposition’s results attribute; see retain_torch_trained_outputs for additional details.
retain_torch_targets (LearningScale : default MINIBATCH) – specifies the default for the AutodiffComposition for when to copy the targets used for training the Pytorch model to the AutodiffComposition’s torch_targets attribute, which can be overridden by specifying the retain_torch_targets argument in the learn method; see retain_torch_targets for additional details.
retain_torch_losses (LearningScale : default MINIBATCH) – specifies the default for the AutodiffComposition for the scale at which the losses of the Pytorch model are tracked, all of which are stored in the AutodiffComposition’s torch_losses attribute at the end of the run; see retain_torch_losses for additional details.
device (torch.device : default device-dependent) – specifies the device on which the model is run. If None, the device is set to ‘cuda’ if available, then ‘mps’, otherwise ‘cpu’.
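The fallback order described for device can be sketched as a small pure-Python function. This is a hypothetical helper, not PsyNeuLink's implementation: the two boolean flags stand in for the availability checks that PyTorch provides (torch.cuda.is_available() and torch.backends.mps.is_available()), so that the sketch runs without PyTorch installed.

```python
def choose_device(requested=None, cuda_available=False, mps_available=False):
    """Return a device name following the documented fallback order.

    Hypothetical helper: an explicit request wins; otherwise prefer
    'cuda', then 'mps', and fall back to 'cpu'.
    """
    if requested is not None:
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

default_device = choose_device()
apple_device = choose_device(mps_available=True)
gpu_device = choose_device(cuda_available=True, mps_available=True)
explicit_device = choose_device(requested="cpu", cuda_available=True)
```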

pytorch_representation = None

optimizer¶

the optimizer used for training. Depends on the optimizer_type, learning_rate, and weight_decay arguments from initialization.

Type: PyTorch optimizer function

loss¶

the loss function used for training. Depends on the loss_spec argument from initialization.

Type: PyTorch loss function

learning_rate¶

determines the learning_rate passed to the optimizer, and is applied to all Projections in the AutodiffComposition that are learnable.

Note

At present, the same learning rate is applied to all Components of an AutodiffComposition, irrespective of the learning_rate that may be specified for any individual Mechanisms or any nested Compositions; in the case of the latter, the learning_rate of the outermost AutodiffComposition is used, whether this is specified in the call to its learn method, its constructor, or its default value is being used.

Hint

To disable updating of a particular MappingProjection in an AutodiffComposition, specify the learnable parameter of its constructor as False; this applies to MappingProjections at any level of nesting.

Type: float

synch_projection_matrices_with_torch¶

determines when to copy PyTorch parameters to PsyNeuLink Projection matrices (connection weights) if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with parameter updates in Pytorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, MINIBATCH, EPOCH or RUN
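The trade-off between synchronization frequency and performance can be illustrated by counting how many copies each setting would trigger in a simple training loop. The sketch below is a stand-in written for this purpose, not PsyNeuLink code: the strings mirror three of the LearningScale names, and the loop structure (a fixed number of minibatches per epoch) is an assumption.

```python
def count_synchs(scale, epochs, minibatches_per_epoch):
    """Count how often weights would be copied back for a given scale."""
    copies = 0
    for _epoch in range(epochs):
        for _minibatch in range(minibatches_per_epoch):
            # ... forward pass, loss, backward pass, weight update ...
            if scale == "MINIBATCH":
                copies += 1
        if scale == "EPOCH":
            copies += 1
    if scale == "RUN":
        copies += 1
    return copies

counts = {scale: count_synchs(scale, epochs=3, minibatches_per_epoch=4)
          for scale in ("MINIBATCH", "EPOCH", "RUN")}
```

With 3 epochs of 4 minibatches each, copying at MINIBATCH happens twelve times, at EPOCH three times, and at RUN only once, which is why the coarser settings are cheaper.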

synch_node_variables_with_torch¶

determines when to copy the current input to Pytorch functions to the PsyNeuLink variable attribute of the corresponding PsyNeuLink nodes, if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with Pytorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH, RUN or None

synch_node_values_with_torch¶

determines when to copy the current output of Pytorch functions to the PsyNeuLink value attribute of the corresponding PsyNeuLink nodes, if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with Pytorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, MINIBATCH, EPOCH or RUN

synch_results_with_torch¶

determines when to copy the current outputs of Pytorch nodes to the PsyNeuLink results attribute of the AutodiffComposition if this is not specified in the call to learn. Copying more frequently keeps the PsyNeuLink representation more closely synchronized with parameter updates in Pytorch, but slows performance (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH or RUN

retain_torch_trained_outputs¶

determines the scale at which the outputs of the Pytorch model are tracked, all of which are stored in the AutodiffComposition’s torch_trained_outputs attribute at the end of the run, if this is not specified in the call to learn (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, MINIBATCH, EPOCH, RUN or None

retain_torch_targets¶

determines the scale at which the targets used for training the Pytorch model are tracked, all of which are stored in the AutodiffComposition’s torch_targets attribute at the end of the run, if this is not specified in the call to learn (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH, RUN or None

retain_torch_losses¶

determines the scale at which the losses of the Pytorch model are tracked, all of which are stored in the AutodiffComposition’s torch_losses attribute at the end of the run, if this is not specified in the call to learn (see AutodiffComposition_PyTorch_LearningScale for information about settings).

Type: OPTIMIZATION_STEP, MINIBATCH, EPOCH, RUN or None

torch_trained_outputs¶

stores the outputs (converted to np arrays) of the Pytorch model trained during learning, at the frequency specified by retain_torch_trained_outputs if it is set to MINIBATCH, EPOCH, or RUN; see retain_torch_trained_outputs for additional details.

Type: List[ndarray]

torch_targets¶

stores the targets used for training the Pytorch model during learning at the frequency specified by retain_torch_targets if it is set to MINIBATCH, EPOCH, or RUN; see retain_torch_targets for additional details.

Type: List[ndarray]

torch_losses¶

stores the average loss after each weight update (i.e. each minibatch) during learning, at the frequency specified by retain_torch_losses if it is set to MINIBATCH, EPOCH, or RUN; see retain_torch_losses for additional details.

Type: list of floats

last_saved_weights¶

path for file to which weights were last saved.

Type: path

last_loaded_weights¶

path for file from which weights were last loaded.

Type: path

device¶

the device on which the model is run.

Type: torch.device

class PytorchCompositionWrapper(composition, device, outer_creator=None, context=None)¶

Wrapper for a Composition as a Pytorch Module.

Wraps an AutodiffComposition as a PyTorch module, with each Mechanism in the AutodiffComposition wrapped as a PytorchMechanismWrapper, each Projection wrapped as a PytorchProjectionWrapper, and any nested Compositions wrapped as PytorchCompositionWrappers. Each PytorchMechanismWrapper implements a Pytorch version of the function(s) of the wrapped Mechanism, which are executed in the PytorchCompositionWrapper’s forward method in the order specified by the AutodiffComposition’s scheduler. The matrix Parameters of each wrapped Projection are assigned as parameters of the PytorchMechanismWrapper Pytorch module and used, together with a Pytorch matmul operation, to generate the input to each PyTorch function as specified by the PytorchProjectionWrapper’s graph. The graph can be visualized using the AutodiffComposition’s show_graph method and setting its show_pytorch argument to True (see PytorchShowGraph for additional information).
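The forward computation described above can be approximated in a few lines of plain Python. This is only a conceptual sketch under stated assumptions: nodes are identified by strings, projection matrices are nested lists keyed by (sender, receiver), the execution order is given as a list rather than produced by a scheduler, and matmul_row stands in for the PyTorch matmul operation. None of these names come from PsyNeuLink.

```python
def matmul_row(vector, matrix):
    """Row vector times matrix, using plain lists."""
    return [sum(v * row[j] for v, row in zip(vector, matrix))
            for j in range(len(matrix[0]))]

def identity(values):
    return list(values)

def relu(values):
    return [max(0.0, v) for v in values]

def forward(inputs, execution_order, projections, functions):
    """Execute nodes in order; a node's input is the sum over its afferents."""
    values = {}
    for node in execution_order:
        if node in inputs:
            variable = list(inputs[node])
        else:
            afferents = [matmul_row(values[sender], matrix)
                         for (sender, receiver), matrix in projections.items()
                         if receiver == node]
            variable = [sum(parts) for parts in zip(*afferents)]
        values[node] = functions[node](variable)
    return values

projections = {("in", "hidden"): [[1.0, -1.0], [0.5, -0.5], [0.0, -2.0]],
               ("hidden", "out"): [[2.0], [1.0]]}
values = forward({"in": [1.0, 2.0, 3.0]},
                 ["in", "hidden", "out"],
                 projections,
                 {"in": identity, "hidden": relu, "out": identity})
```

In this sketch only the matrices in projections would be trainable parameters; the node functions carry no weights of their own.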

Two main responsibilities:

1. Set up the functions and parameters of the PyTorch module required for its forward computation. This includes handling nested Compositions (flattened in infer_backpropagation_learning_pathways), by dealing with Projections into and/or out of a nested Composition as shown in the figure below (note: Projections in the outer Composition to/from a nested Composition’s CIMs are learnable, and ones in a nested Composition from/to its CIMs are not):

    [     OUTER      ][                             NESTED                              ][      OUTER     ]
              learnable       not learnable                         not learnable       learnable
    ---> [Node] ----> [input_CIM] ~~~> [INPUT Node] ----> [OUTPUT Node] ~~~> [output_CIM] ----> [Node] --->
          sndr           rcvr          nested_rcvr         nested_sndr           sndr            rcvr
           ^--projection-->^                                                      ^---projection-->^
           ^----PytorchProjectionWrapper---->^                  ^----PytorchProjectionWrapper---->^
                          ENTRY                                                EXIT

2. Handle coordination of passing data and outcomes back to PsyNeuLink objects, handled by two main methods:
- synch_with_psyneulink()
  Copies matrix weights, node variables, node values, and/or autodiff results at user-specified intervals (LearningScale: OPTIMIZATION_STEP, TRIAL, MINIBATCH, EPOCH, RUN); these are specified by the user in the following arguments to run() or learn():

  synch_projection_matrices_with_torch=RUN, synch_node_variables_with_torch=None, synch_node_values_with_torch=RUN, synch_results_with_torch=RUN,

  and consolidated in the synch_with_pnl_options dict used by synch_with_psyneulink
- retain_for_psyneulink()
  Retains learning-specific data used and outcomes generated during execution of the PyTorch model (TRAINED_OUTPUT_VALUES, corresponding TARGETS and LOSSES), that are copied to PsyNeuLink at the end of a call to learn(); these are specified by the user in the following arguments to learn():

  retain_torch_trained_outputs=MINIBATCH, retain_torch_targets=MINIBATCH, retain_torch_losses=MINIBATCH,

  and consolidated in the retain_in_pnl_options dict used by retain_for_psyneulink
- Note: RESULTS is handled in an idiosyncratic way: it is specified along with the synchronization parameters, since it is a value ordinarily generated in the execution of a Composition; however, its helper parallels the retain_for_psyneulink helper methods, and it is called from _update_results if TRIAL is specified, in order to integrate with the standard execution of a Composition.

_composition¶

AutodiffComposition being wrapped.

Type: Composition

wrapped_nodes¶

list of nodes in the PytorchCompositionWrapper corresponding to the PyTorch functions that comprise the forward method of the Pytorch module implemented by the PytorchCompositionWrapper. Generally these are Mechanisms wrapped in a PytorchMechanismWrapper, however, if the AutodiffComposition Node being wrapped is a nested Composition, then the wrapped node is itself a PytorchCompositionWrapper object. When the PyTorch model is executed, all of these are “flattened” into a single PyTorch module, corresponding to the outermost AutodiffComposition being wrapped, which can be visualized using that AutodiffComposition’s show_graph method and setting its show_pytorch argument to True (see PytorchShowGraph for additional information).

Type: List[PytorchMechanismWrapper]

nodes_map¶

maps psyneulink Nodes to PytorchCompositionWrapper nodes.

Type: Dict[Node: PytorchMechanismWrapper or PytorchCompositionWrapper]

projection_wrappers¶

list of PytorchProjectionWrappers in the PytorchCompositionWrapper, each of which wraps a Projection in the AutodiffComposition being wrapped.

Type: List[PytorchProjectionWrapper]

projections_map¶

maps Projections in the AutodiffComposition being wrapped to PytorchProjectionWrappers in the PytorchCompositionWrapper.

Type: Dict[Projection: PytorchProjectionWrapper]

_nodes_to_execute_after_gradient_calc¶

contains nodes specified as exclude_from_gradient_calc as keys, and their current variable as values

Type: Dict[node : torch.Tensor]

optimizer¶

assigned by AutodiffComposition after the wrapper is created, which passes the parameters to the optimizer

Type: torch

device¶

device used to process torch Tensors in PyTorch functions

Type: torch.device

params¶

list of PyTorch parameters (connection weight matrices) in the PyTorch model.

Type: nn.ParameterList()

minibatch_loss¶

accumulated loss over all trials (stimuli) within a batch.

Type: torch.Tensor

minibatch_loss_count¶

count of losses (trials) within batch, used to calculate average loss per batch.

Type: int
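The bookkeeping implied by minibatch_loss and minibatch_loss_count amounts to a running sum and a counter that are reset after each weight update. A minimal sketch, assuming plain floats in place of torch.Tensor losses and a hypothetical MinibatchLossTracker class that is not part of PsyNeuLink:

```python
class MinibatchLossTracker:
    """Accumulate per-trial losses and report their average per minibatch."""

    def __init__(self):
        self.minibatch_loss = 0.0
        self.minibatch_loss_count = 0

    def add_trial_loss(self, loss):
        # Called once per trial (stimulus) within the minibatch
        self.minibatch_loss += loss
        self.minibatch_loss_count += 1

    def average_and_reset(self):
        # Called at the optimization step: average, then start a new minibatch
        average = self.minibatch_loss / self.minibatch_loss_count
        self.minibatch_loss = 0.0
        self.minibatch_loss_count = 0
        return average

tracker = MinibatchLossTracker()
for trial_loss in (0.5, 0.25, 0.75):
    tracker.add_trial_loss(trial_loss)
average_loss = tracker.average_and_reset()
```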

retained_results¶

list of the output_values of the AutodiffComposition for every trial executed in a call to run or learn.

Type: List[ndarray]

retained_trained_outputs¶

values of the trained OUTPUT Node (i.e., ones associated with a TARGET Node) for each trial executed in a call to learn.

Type: List[ndarray]

retained_targets¶

values of the TARGET Nodes for each trial executed in a call to learn.

Type: List[ndarray]

retained_losses¶

losses per batch, epoch or run accumulated over a call to learn()

Type: List[ndarray]

_regenerate_paramlist()¶: Add Projection matrices to Pytorch Module’s parameter list

copy_node_values_to_psyneulink(nodes='all', context=None)¶: Copy output of Pytorch nodes to value of AutodiffComposition nodes. IMPLEMENTATION NOTE: list included in nodes arg to allow for future specification of specific nodes to copy

copy_node_variables_to_psyneulink(nodes='all', context=None)¶: Copy input to Pytorch nodes to variable of AutodiffComposition nodes. IMPLEMENTATION NOTE: list included in nodes arg to allow for future specification of specific nodes to copy

copy_results_to_psyneulink(current_condition, context=None)¶: Copy outputs of Pytorch forward() to AutodiffComposition.results attribute.

execute_node(node, variable, optimization_num, context=None)¶: Execute node and store the result in the node’s value attribute. Implemented as a method (and includes optimization_rep and context as args) so that it can be overridden by subclasses of PytorchCompositionWrapper.

forward(inputs, optimization_rep, context=None)¶

Forward method of the model for PyTorch and LLVM modes. Returns a dictionary {output_node:value} of output values for the model.

Return type: dict

retain_for_psyneulink(data, retain_in_pnl_options, context)¶

Store outputs, targets, and losses from Pytorch execution for copying to PsyNeuLink at end of learn().

Parameters

data (dict) – specifies local data available to retain (for copying to pnl at end of run); keys must be one or more of the keywords OUTPUTS, TARGETS, or LOSSES; value must be a torch.Tensor.
retain_in_pnl_options (dict) – specifies which data the user has requested be retained (and copied to pnl at end of run); keys must be OUTPUTS, TARGETS, or LOSSES; value must be a LearningScale.name or None (which suppresses copy).

Note: this does not actually copy data to pnl; that is done by _getter methods for the relevant autodiff Parameters.

retain_losses(loss)¶: Track losses and copy to AutodiffComposition.torch_losses at end of learn().

retain_results(results)¶: Track results and copy to AutodiffComposition.results at end of learn().

retain_targets(targets)¶: Track targets and copy to AutodiffComposition.torch_targets at end of learn().

retain_trained_outputs(trained_outputs)¶: Track trained outputs and copy to AutodiffComposition.torch_trained_outputs at end of learn().

synch_with_psyneulink(synch_with_pnl_options, current_condition, context, params=None)¶: Copy weights, values, and/or results from Pytorch to PsyNeuLink at specified junctures. params can be used to restrict the copy to a specific (set of) param(s); if params is not specified, all are copied.

pytorch_composition_wrapper_type¶: alias of psyneulink.library.compositions.pytorchwrappers.PytorchCompositionWrapper

class Parameters(owner, parent=None)¶

assign_ShowGraph(show_graph_attributes)¶: Override to replace assignment of ShowGraph class with PytorchShowGraph if torch is available

infer_backpropagation_learning_pathways(execution_mode, context=None)¶

Create backpropagation learning pathways for every Input Node –> Output Node pathway. Flattens nested compositions:

only includes the Projections in outer Composition to/from the CIMs of the nested Composition (i.e., to input_CIMs and from output_CIMs) – the ones that should be learned;

excludes Projections from/to CIMs in the nested Composition (from input_CIMs and to output_CIMs), as those should remain identity Projections;

see PytorchCompositionWrapper for table of how Projections are handled and further details.

Returns list of target nodes for each pathway

Return type: list

_build_pytorch_representation(context=None, refresh=False)¶: Builds a Pytorch representation of the AutodiffComposition

autodiff_forward(inputs, targets, synch_with_pnl_options, retain_in_pnl_options, execution_mode, scheduler, context)¶: Perform forward pass of model and compute loss for a batch of trials in Pytorch mode. Losses are then accumulated, and error is backpropagated by compositionrunner.run_learning() before the next time it calls run(), in a call to backward() by do_gradient_optimization() in _batch_inputs() or _batch_function_inputs().

do_gradient_optimization(retain_in_pnl_options, context, optimization_num=None)¶: Compute loss and use in call to autodiff_backward() to compute gradients and update PyTorch parameters. Updates parameters (weights) based on trial(s) executed since last optimization, and reinitializes minibatch_loss and minibatch_loss_count.

autodiff_backward(minibatch_loss, context)¶: Calculate gradients and apply to PyTorch model parameters (weights)

_get_autodiff_inputs_values(input_dict)¶

Remove TARGET Nodes and return a dict with the values of INPUT Nodes for a single trial. For nested Compositions, replace the input to the nested Composition with inputs to its INPUT Nodes. For InputPorts, replace with the owner.

Returns a dict mapping INPUT Nodes -> input values for a single trial

Return type: dict
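The three substitutions described above can be sketched with hypothetical stand-in classes. `Node`, `InputPort`, `Nested`, and `autodiff_inputs` are illustrative names, not PsyNeuLink classes, and the assumption that a nested Composition's input is a sequence with one entry per INPUT Node is made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str


@dataclass(frozen=True)
class InputPort:
    name: str
    owner: Node


@dataclass(frozen=True)
class Nested:
    name: str
    input_nodes: tuple  # INPUT Nodes of the nested Composition


def autodiff_inputs(input_dict, target_nodes):
    """Return INPUT Node -> value, dropping TARGET Nodes and expanding keys."""
    result = {}
    for key, value in input_dict.items():
        if key in target_nodes:
            continue  # TARGET Nodes are handled separately
        if isinstance(key, InputPort):
            result[key.owner] = value  # replace an InputPort with its owner
        elif isinstance(key, Nested):
            # Assumed layout: one value per INPUT Node of the nested Composition
            for node, node_value in zip(key.input_nodes, value):
                result[node] = node_value
        else:
            result[key] = value
    return result


a, b, c, d, target = (Node(n) for n in ("a", "b", "c", "d", "target"))
inner = Nested("inner", (c, d))
inputs = autodiff_inputs(
    {a: [1.0], target: [9.0], InputPort("b_port", b): [2.0], inner: [[3.0], [4.0]]},
    target_nodes={target},
)
```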

_get_autodiff_targets_values(input_dict)¶

Return a dict with values for TARGET Nodes. Gets the inputs to TARGET Nodes used for computation of loss in autodiff_forward(). Uses input_dict to get values for TARGET Nodes that are INPUT Nodes of the AutodiffComposition. If a TARGET Node is not an INPUT Node, it is assumed to be the target of a projection from an INPUT Node, and the value is determined by searching recursively for the INPUT Node that projects to the TARGET Node.

Returns a dict mapping TARGET Nodes -> target values

Return type: dict
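The recursive lookup described above can be sketched in plain Python. The function names and the `afferents` mapping (receiver name -> list of sender names) are hypothetical and stand in for the Projection graph; the sketch only shows the search strategy, not the actual PsyNeuLink code.

```python
def find_input_source(node, afferents, input_dict, seen=None):
    """Return the key of input_dict that feeds node, searching senders recursively."""
    if node in input_dict:
        return node
    seen = set() if seen is None else seen
    seen.add(node)  # guard against revisiting nodes
    for sender in afferents.get(node, ()):
        if sender not in seen:
            found = find_input_source(sender, afferents, input_dict, seen)
            if found is not None:
                return found
    return None


def target_values(target_nodes, afferents, input_dict):
    """Map each TARGET node to the value supplied for it, directly or upstream."""
    values = {}
    for target in target_nodes:
        source = find_input_source(target, afferents, input_dict)
        if source is not None:
            values[target] = input_dict[source]
    return values


# target_a is itself an input; target_b is reached from input_x through a relay
afferents = {"target_b": ["relay"], "relay": ["input_x"]}
input_dict = {"target_a": [1.0], "input_x": [0.5]}
targets = target_values(["target_a", "target_b"], afferents, input_dict)
```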

_parse_learning_spec(inputs, targets, execution_mode, context)¶

Converts learning inputs and targets to a standardized form

Returns

dict – Dict mapping mechanisms to values (with TargetMechanisms inferred from output nodes if needed)
int – Number of input sets in dict for each input node in the Composition

_identify_target_nodes(context)¶: Recursively call all nested AutodiffCompositions to assign TARGET nodes for learning

learn(*args, synch_projection_matrices_with_torch=NotImplemented, synch_node_variables_with_torch=NotImplemented, synch_node_values_with_torch=NotImplemented, synch_results_with_torch=NotImplemented, retain_torch_trained_outputs=NotImplemented, retain_torch_targets=NotImplemented, retain_torch_losses=NotImplemented, **kwargs)¶

Override to handle synch and retain args. Note: defaults for synch and retain args are set to NotImplemented, so that the user can specify None if they want to locally override the default values for the AutodiffComposition (see docstrings for run() and _parse_synch_and_retain_args() for additional details).

Return type: list
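The reason for using NotImplemented rather than None as the default can be shown with a small sketch of the sentinel pattern. The `resolve_option` helper is hypothetical and is not the body of _parse_synch_and_retain_args(); the two default values are the ones listed for these Parameters in this class reference.

```python
# Defaults taken from the Parameters listing for AutodiffComposition
DEFAULTS = {
    "synch_results_with_torch": "run",
    "retain_torch_losses": "minibatch",
}


def resolve_option(name, value=NotImplemented, defaults=DEFAULTS):
    """Return the caller's value, falling back to the default only if unspecified."""
    if value is NotImplemented:
        return defaults[name]  # argument was not passed at all
    return value               # includes an explicit None, which means "turn off"


unspecified = resolve_option("synch_results_with_torch")
explicit_none = resolve_option("synch_results_with_torch", None)
explicit_value = resolve_option("retain_torch_losses", "run")
```

With None as the default, an omitted argument and an explicit None would be indistinguishable; the sentinel keeps None available as a meaningful local override.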

_get_execution_mode(execution_mode)¶: Parse the execution_mode argument and return a valid execution mode for the learn() method. Can be overridden by subclasses to change the permitted and/or default execution mode for learning.

execute(inputs=None, num_trials=None, minibatch_size=1, optimizations_per_minibatch=1, do_logging=False, scheduler=None, termination_processing=None, call_before_minibatch=None, call_after_minibatch=None, call_before_time_step=None, call_before_pass=None, call_after_time_step=None, call_after_pass=None, reset_stateful_functions_to=None, context=None, base_context=<psyneulink.core.globals.context.Context object>, clamp_input='soft_clamp', targets=None, runtime_params=None, execution_mode=ExecutionMode.PyTorch, skip_initialization=False, synch_with_pnl_options=None, retain_in_pnl_options=None, report_output=ReportOutput.OFF, report_params=ReportParams.OFF, report_progress=ReportProgress.OFF, report_simulations=ReportSimulations.OFF, report_to_devices=None, report=None, report_num=None)¶

Override to execute autodiff_forward() in learning mode if execution_mode is not Python

Return type: ndarray

run(*args, synch_projection_matrices_with_torch=NotImplemented, synch_node_variables_with_torch=NotImplemented, synch_node_values_with_torch=NotImplemented, synch_results_with_torch=NotImplemented, retain_torch_trained_outputs=NotImplemented, retain_torch_targets=NotImplemented, retain_torch_losses=NotImplemented, batched_results=False, **kwargs)¶: Override to handle synch and retain args if run() is called directly rather than from learn(). Note: defaults for synch and retain args are NotImplemented, so that the user can specify None if they want to locally override the default values for the AutodiffComposition (see _parse_synch_and_retain_args() for details). This is distinct from the user assigning the Parameter default_value(s), which is done in the AutodiffComposition constructor and handled by the Parameter._specify_none attribute.

_update_results(results, trial_output, execution_mode, synch_with_pnl_options, context)¶: Update results by appending the most recent trial_output. This is included as a helper so it can be overridden by subclasses (such as AutodiffComposition) that may need to do this less frequently for scalable execution.

save(path=None, directory=None, filename=None, context=None)¶

Saves all weight matrices for all MappingProjections in the AutodiffComposition

Parameters

path (Path, PosixPath or str : default None) – path specification; must be a legal path specification in the filesystem.
directory (str : default current working directory) – directory where matrices for all MappingProjections in the AutodiffComposition are saved.
filename (str : default <name of AutodiffComposition>_matrix_wts.pnl) – filename in which matrices for all MappingProjections in the AutodiffComposition are saved.
Note: Matrices are saved in PyTorch state_dict format.

Return type: Path

load(path=None, directory=None, filename=None, context=None, weights_only=False)¶

Loads all weight matrices for all MappingProjections in the AutodiffComposition from file

Parameters

path (PosixPath) – path for file in which MappingProjection matrices are stored. This must be a legal PosixPath object; if it is specified, directory and filename are ignored.
directory (str : default current working directory) – directory where MappingProjection matrices are stored.
filename (str : default <name of AutodiffComposition>_matrix_wts.pnl) – name of file in which MappingProjection matrices are stored.

Note: Matrices must be stored in PyTorch state_dict format.
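The way path, directory, and filename combine can be sketched with pathlib. The `resolve_weights_path` helper is hypothetical and only mirrors the documented defaults (current working directory, `<name of AutodiffComposition>_matrix_wts.pnl`) and the documented rule that a given path takes precedence; it is not the code used by save() or load().

```python
from pathlib import Path


def resolve_weights_path(composition_name, path=None, directory=None, filename=None):
    """Resolve the file location from the three optional arguments."""
    if path is not None:
        return Path(path)  # directory and filename are ignored
    base = Path(directory) if directory is not None else Path.cwd()
    name = filename if filename is not None else f"{composition_name}_matrix_wts.pnl"
    return base / name


explicit = resolve_weights_path("xor", path="/tmp/w.pnl", directory="/ignored")
in_directory = resolve_weights_path("xor", directory="/tmp")
default_location = resolve_weights_path("xor")
```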

parameters = <psyneulink.library.compositions.autodiffcomposition.AutodiffComposition.Parameters object> : ( device = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='device' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), execute_until_finished = Parameter( default_value=True delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='execute_until_finished' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), execution_count = Parameter( default_value=array(0) delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='execution_count' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=False structural=False user=True values={} ), has_initializers = Parameter( default_value=False delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='has_initializers' parse_spec=False pnl_internal=True port=None reference=False setter=<function _has_initializers_setter> specify_none=False stateful=True structural=False user=True values={} ), input_specification = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='input_specification' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=False structural=False 
user=True values={} ), is_finished_flag = Parameter( default_value=True delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='is_finished_flag' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), learning_rate = Parameter( default_value=array(0.001) delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='learning_rate' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), learning_results = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='learning_results' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), max_executions_before_finished = Parameter( default_value=array(1000) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='max_executions_before_finished' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), minibatch_size = Parameter( default_value=array(1) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None modulable=True modulation_combination_function=None name='minibatch_size' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), num_executions = Parameter( 
default_value=Time(run: 0, trial: 0, pass: 0, time_step: 0) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='num_executions' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), num_executions_before_finished = Parameter( default_value=array(0) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='num_executions_before_finished' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), optimizations_per_minibatch = Parameter( default_value=array(1) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None modulable=True modulation_combination_function=None name='optimizations_per_minibatch' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), optimizer = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='optimizer' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), pytorch_representation = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='pytorch_representation' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), results = Parameter( 
default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='results' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), retain_old_simulation_data = Parameter( default_value=False delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='retain_old_simulation_data' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=False structural=False user=True values={} ), retain_torch_losses = Parameter( default_value='minibatch' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='retain_torch_losses' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), retain_torch_targets = Parameter( default_value='minibatch' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='retain_torch_targets' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), retain_torch_trained_outputs = Parameter( default_value='minibatch' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='retain_torch_trained_outputs' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), simulation_results = Parameter( default_value=[] 
delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=False mdf_name=None name='simulation_results' parse_spec=False pnl_internal=True port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_node_values_with_torch = Parameter( default_value='run' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_node_values_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_node_variables_with_torch = Parameter( default_value=None delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_node_variables_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_projection_matrices_with_torch = Parameter( default_value='run' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_projection_matrices_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), synch_results_with_torch = Parameter( default_value='run' delivery_condition=<LogCondition.OFF: 0> dependencies=None fallback_default=True function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='synch_results_with_torch' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), torch_losses = 
Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True getter=<function _get_torch_losses> history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='torch_losses' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), torch_targets = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True getter=<function _get_torch_targets> history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='torch_targets' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), torch_trained_outputs = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True getter=<function _get_torch_trained_outputs> history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='torch_trained_outputs' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), trial_losses = Parameter( default_value=[] delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='trial_losses' parse_spec=False pnl_internal=False port=None reference=False specify_none=False stateful=True structural=False user=True values={} ), value = Parameter( default_value=NotImplemented delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='value' parse_spec=False pnl_internal=False port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), variable = Parameter( constructor_argument='default_variable' 
default_value=array([0]) delivery_condition=<LogCondition.OFF: 0> dependencies=None function_arg=True history={} history_max_length=1 history_min_length=0 loggable=True mdf_name=None name='variable' parse_spec=False pnl_internal=True port=None read_only=True reference=False specify_none=False stateful=True structural=False user=True values={} ), )¶

show_graph(*args, **kwargs)¶: Override to use PytorchShowGraph if show_pytorch is True