ParameterEstimationComposition¶
Overview¶
A ParameterEstimationComposition is a subclass of Composition that is used to estimate specified parameters of a model Composition, either to fit the outputs of the model to a set of data (Data Fitting) via likelihood maximization using kernel density estimation (KDE), or to optimize a user-provided scalar objective_function (Parameter Optimization). In either case, when the ParameterEstimationComposition is run with a given set of inputs, it returns, in its optimized_parameter_values attribute, the set of parameter values that it estimates best satisfy the relevant condition. The results attribute is also set to the optimal parameter values. The arguments below are used to configure a ParameterEstimationComposition for either Data Fitting or Parameter Optimization; they are followed by sections that describe the arguments specific to each.
model - specifies the Composition whose parameters are to be estimated.
Note
Neither the controller nor any of its associated arguments can be specified in the constructor for a ParameterEstimationComposition; the controller is constructed automatically using the arguments described below.
parameters - specifies the parameters of the model to be estimated. These are specified in a dict, in which the key of each entry specifies a parameter to estimate, and its value is a list of values to sample for that parameter.
outcome_variables - specifies the OUTPUT Nodes of the model, the values of which are used to evaluate the fit of the different combinations of parameter values sampled. An important limitation of the ParameterEstimationComposition (PEC) is that the outcome_variables must be a subset of the output ports of the model’s terminal Mechanism.
optimization_function - specifies the function used to search over the combinations of parameter values to be estimated. This must be either an instance of PECOptimizationFunction or the string name of one of the supported optimizers.
num_estimates - specifies the number of independent samples that are estimated for a given combination of parameter values.
num_trials_per_estimate - specifies the number of trials executed when the model is run for each estimate of a combination of parameter values. Typically, this can be left unspecified, in which case the model is run until all trials of inputs are exhausted.
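As a rough illustration of the parameters and num_estimates arguments described above, the following sketch enumerates every combination of candidate values in a parameters-style dict and counts how many model runs an exhaustive search over them would require. It uses only the Python standard library. The parameter names (`rate`, `threshold`), their values, and the helper `enumerate_combinations` are hypothetical and are not part of PsyNeuLink.

```python
from itertools import product


def enumerate_combinations(parameters):
    """Return every combination of sampled values as a list of dicts.

    `parameters` maps a parameter name to the list of values to sample,
    mirroring the dict format described above (the names are made up).
    """
    names = list(parameters)
    return [
        dict(zip(names, values))
        for values in product(*(parameters[name] for name in names))
    ]


# Hypothetical parameters with their candidate values.
parameters = {"rate": [0.1, 0.2, 0.3], "threshold": [0.5, 1.0]}
num_estimates = 4  # independent samples per combination

combinations = enumerate_combinations(parameters)
total_runs = len(combinations) * num_estimates
```

Here the grid contains 3 × 2 = 6 combinations, so with num_estimates set to 4 an exhaustive search would run the model 24 times.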
Data Fitting¶
The ParameterEstimationComposition can be used to find a set of parameters for the model such that, when it is run with a given set of inputs, its results best match (maximum likelihood) a specified set of empirical data. This requires that the data argument be specified:
data - specifies the data to which the outcome_variables are fit in the estimation process. The data must be in a format that aligns with the specification of the outcome_variables: a pandas DataFrame in which each column corresponds to one of the outcome_variables. If one of the outcome variables should be treated as a categorical variable (e.g., a decision value in a two-alternative forced choice task modeled by a DDM), then it should be specified as a pandas Categorical variable.
objective_function - a function that computes the sum of the log-likelihoods of the data is automatically assigned for data fitting and should not need to be specified. This function uses kernel density estimation to compute the likelihood of the data given the model. To use your own estimate of the likelihood, see Parameter Optimization below.
Warning
The objective_function argument should NOT be specified for data fitting; specifying both the data and objective_function arguments generates an error.
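To make the idea of a KDE-based log-likelihood concrete, here is a minimal sketch in plain Python. It assumes a single continuous outcome variable and a Gaussian kernel with a fixed, hand-picked bandwidth; it is not PsyNeuLink's implementation, and the function name, the bandwidth, and all of the numbers are made up for illustration.

```python
import math


def gaussian_kde_log_likelihood(simulated, observed, bandwidth):
    """Sum of log-likelihoods of `observed` under a Gaussian KDE of `simulated`."""
    norm = 1.0 / (len(simulated) * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for x in observed:
        density = norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in simulated
        )
        # Floor the density so that log() stays finite for far-away points.
        total += math.log(max(density, 1e-300))
    return total


observed = [0.9, 1.0, 1.1]           # made-up empirical data
near = [0.8, 0.95, 1.0, 1.05, 1.2]   # simulated outputs close to the data
far = [2.8, 2.95, 3.0, 3.05, 3.2]    # simulated outputs far from the data

ll_near = gaussian_kde_log_likelihood(near, observed, bandwidth=0.2)
ll_far = gaussian_kde_log_likelihood(far, observed, bandwidth=0.2)
```

Parameter values whose simulated outputs resemble the data (`near`) receive a higher summed log-likelihood than values whose outputs do not (`far`), which is the quantity that likelihood maximization tries to increase.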
Parameter Optimization¶
The ParameterEstimationComposition can be used to find a set of parameters for the model such that, when it is run with a given set of inputs, its results either maximize or minimize the objective_function, as determined by the optimization_function. This requires that the objective_function argument be specified:
objective_function - specifies a function used to evaluate the values of the outcome_variables, according to which combinations of parameters are assessed; this must be a Callable that takes a 3d array as its only argument, the shape of which will be (num_estimates, num_trials, number of outcome_variables). The function should specify how to aggregate the value of each outcome_variable over num_estimates and/or num_trials if either is greater than 1.
Warning
The data argument should NOT be specified for parameter optimization; specifying both the objective_function and the data arguments generates an error.
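A callable of the kind described above might look like the following sketch, which reduces a 3d array of shape (num_estimates, num_trials, number of outcome_variables) to a single scalar. It assumes NumPy is available; the function name, the choice to score only the first outcome variable, and the example numbers are illustrative assumptions rather than anything prescribed by PsyNeuLink.

```python
import numpy as np


def mean_first_outcome(values):
    """Average the first outcome variable over estimates and trials.

    `values` has shape (num_estimates, num_trials, num_outcome_variables).
    """
    values = np.asarray(values, dtype=float)
    # Average over estimates (axis 0), leaving (num_trials, num_outcome_variables).
    per_trial = values.mean(axis=0)
    # Average the first outcome variable over trials to get a scalar score.
    return float(per_trial[:, 0].mean())


# Made-up outputs: 2 estimates, 3 trials, 2 outcome variables.
example = np.array([
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[3.0, 10.0], [4.0, 20.0], [5.0, 30.0]],
])
score = mean_first_outcome(example)
```

The order of aggregation (estimates first, then trials) does not matter for a plain mean, but it would for other reductions, which is why the function has to state it explicitly.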
Supported Optimizers¶
Structure¶
ParameterEstimationComposition uses a PEC_OCM as its controller, a specialized subclass of OptimizationControlMechanism that intercepts inputs provided to the run method of the ParameterEstimationComposition, and assigns them directly to the state_feature_values of the PEC_OCM when it executes.
Class Reference¶
- class psyneulink.core.compositions.parameterestimationcomposition.ParameterEstimationComposition(parameters, outcome_variables, optimization_function, model=None, data=None, likelihood_include_mask=None, data_categorical_dims=None, objective_function=None, num_estimates=1, num_trials_per_estimate=None, initial_seed=None, same_seed_for_all_parameter_combinations=None, depends_on=None, name=None, context=None, **kwargs)¶
Subclass of Composition that estimates specified parameters either to fit the results of a Composition to a set of data or to optimize a specified function.
Automatically implements an OptimizationControlMechanism as its controller, which is constructed using arguments to the ParameterEstimationComposition’s constructor as described below.
The following arguments are those specific to ParameterEstimationComposition; see Composition for additional arguments.
- Parameters
parameters (dict) – specifies the parameters of the model to be estimated. These are specified in a dict, in which the key of each entry specifies a parameter to estimate, and its value is a list of values to sample for that parameter.
depends_on (Optional[Mapping]) – a dictionary that specifies which parameters depend on a condition. Its keys are specified identically to the keys of the parameters dictionary. Each value is a string that names the column in the data on which the parameter depends. The values of this column must be categorical. Each unique value represents a condition and results in a separate parameter being estimated for it, so the number of unique values should be small.
outcome_variables (Union[list[Mechanism], Mechanism, list[OutputPort], OutputPort]) – specifies the OUTPUT Nodes of the model, the values of which are used to evaluate the fit of the different combinations of parameter values sampled. An important limitation of the PEC is that the outcome_variables must be a subset of the output ports of the model’s terminal Mechanism.
model (Optional[Composition]) – specifies an external Composition for which parameters are to be fit to data or optimized according to a specified objective_function.
data (Optional[DataFrame]) – specifies the data to which the outcome_variables are fit in the estimation process. The data must be in a format that aligns with the specification of the outcome_variables: a pandas DataFrame in which each column corresponds to one of the outcome_variables. If one of the outcome variables should be treated as a categorical variable (e.g., a decision value in a two-alternative forced choice task modeled by a DDM), then it should be specified as a pandas Categorical variable.
data_categorical_dims (Union[Iterable] : default None) – specifies the dimensions of the data that are categorical. If a list of boolean values is provided, it is assumed to be a mask for the categorical data dimensions and must have the same length as the number of columns in data. If it is an iterable of integers, it is assumed to be a list of the indices of the categorical dimensions. If it is None, all data dimensions are assumed to be continuous. Alternatively, if data is a pandas DataFrame, then the columns that have Category dtype are assumed to be categorical.
objective_function (ObjectiveFunction, function or method) – specifies the function used by optimization_function (see objective_function for additional information); the shape of its variable argument (i.e., its first positional argument) must be the same as an array containing the value of the OutputPort corresponding to each item specified in outcome_variables.
optimization_function (OptimizationFunction, function or method : default or MaximumLikelihood or GridSearch) – specifies the function used to search over the combinations of parameter values to be estimated. This must be either an instance of PECOptimizationFunction or the string name of one of the supported optimizers.
num_estimates (int : default 1) – specifies the number of estimates made for each combination of parameter values (see num_estimates for additional information); it is passed to the ParameterEstimationComposition’s controller to set its num_estimates Parameter.
num_trials_per_estimate (int : default None) – specifies an exact number of trials to execute for each run of the model when estimating each combination of parameter values (see num_trials_per_estimate for additional information).
initial_seed (int : default None) – specifies the seed used to initialize the random number generator at construction; it is passed to the ParameterEstimationComposition’s controller to set its initial_seed Parameter.
same_seed_for_all_parameter_combinations (bool : default False) – specifies whether the random number generator is re-initialized to the same value when estimating each combination of parameter values; it is passed to the ParameterEstimationComposition’s controller to set its same_seed_for_all_allocations Parameter.
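The three accepted forms of data_categorical_dims (a boolean mask, an iterable of integer indices, or None) can be pictured as all reducing to one boolean mask over the data columns. The helper below is a hypothetical sketch of that reduction in plain Python; it is not PsyNeuLink's code, and it leaves out the case in which categorical columns are inferred from the Category dtype of a pandas DataFrame.

```python
def categorical_mask(data_categorical_dims, num_columns):
    """Return a list of booleans marking which data columns are categorical."""
    if data_categorical_dims is None:
        # No specification: every dimension is treated as continuous.
        return [False] * num_columns
    dims = list(data_categorical_dims)
    # bool is a subclass of int, so the mask case has to be checked first.
    if dims and all(isinstance(d, bool) for d in dims):
        if len(dims) != num_columns:
            raise ValueError("a boolean mask must have one entry per data column")
        return dims
    # Otherwise interpret the entries as indices of the categorical columns.
    indices = set(dims)
    return [i in indices for i in range(num_columns)]


mask_from_none = categorical_mask(None, 3)
mask_from_bools = categorical_mask([True, False, False], 3)
mask_from_indices = categorical_mask([0, 2], 3)
```

With three data columns, the index form `[0, 2]` and the mask form `[True, False, True]` describe the same set of categorical dimensions.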
- model¶
identifies the Composition used for Data Fitting or Parameter Optimization. If the model argument of the ParameterEstimationComposition’s constructor is not specified, model returns the ParameterEstimationComposition itself.
- Type
- parameters¶
determines the parameters of the model used for Data Fitting or Parameter Optimization (see control for additional details).
- Type
list[Parameters]
- parameter_ranges_or_priors¶
determines the range of values evaluated for each parameter. These are assigned as the allocation_samples for the ControlSignal assigned to the ParameterEstimationComposition’s OptimizationControlMechanism corresponding to each of the specified parameters.
- Type
List[Union[Iterator, Function, List or Value]]
- outcome_variables¶
determines the OUTPUT Nodes of the model, the values of which are either compared to the data when the ParameterEstimationComposition is used for Data Fitting, or evaluated by the ParameterEstimationComposition’s optimization_function when it is used for Parameter Optimization.
- Type
list[Composition Output Nodes]
- data¶
determines the data to be fit by the model when the ParameterEstimationComposition is used for Data Fitting. These must be structured in a form that aligns with the specified outcome_variables (see data for additional details). The data are passed to the optimizer used by optimization_function. Returns None if the model is being used for Parameter Optimization.
- Type
array
- objective_function¶
determines the function used to evaluate the results of the model under each set of parameter values. It is passed to the ParameterEstimationComposition’s OptimizationControlMechanism as the function of its objective_mechanism, which is used to compute the net_outcome of the model each time it is run (see objective_function for additional details).
- Type
ObjectiveFunction, function or method
- optimization_function¶
determines the function used to estimate the parameters of the model that either best fit the data when the ParameterEstimationComposition is used for Data Fitting, or that achieve some maximum or minimum value of the objective_function when the ParameterEstimationComposition is used for Parameter Optimization. This is assigned as the function of the ParameterEstimationComposition’s OptimizationControlMechanism.
- Type
- num_estimates¶
determines the number of estimates of the net_outcome of the model (i.e., number of calls to its evaluate method) for a given combination of parameter values (i.e., control_allocation) evaluated.
- Type
int
- num_trials_per_estimate¶
imposes an exact number of trials to be executed in each run of the model used to evaluate its net_outcome by a call to its OptimizationControlMechanism’s evaluate_agent_rep method. If it is None (the default), then either the number of inputs or the value specified for num_trials in the ParameterEstimationComposition’s run method is used to determine the number of trials executed (see number of trials for additional information).
Note
The num_trials_per_estimate argument is distinct from the num_trials argument of the ParameterEstimationComposition’s run method. The latter determines how many full fits of the model are carried out (that is, how many times the ParameterEstimationComposition itself is run), whereas num_trials_per_estimate determines how many trials are run for a given combination of parameter values within each fit.
- Type
int or None
- initial_seed¶
contains the seed used to initialize the random number generator at construction. It is stored on the ParameterEstimationComposition’s controller, and setting it sets the value of that Parameter (see initial_seed for additional details).
- Type
int or None
- same_seed_for_all_parameter_combinations¶
contains the setting that determines whether the random number generator used to select seeds for each estimate of the model’s net_outcome is re-initialized to the same value for each combination of parameter values evaluated. Its value is stored on the ParameterEstimationComposition’s controller, and setting it sets the value of that Parameter (see same_seed_for_all_allocations for additional details).
- Type
bool
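The effect of re-initializing the generator for every parameter combination can be sketched with Python's standard `random` module. This is only an illustration of the general idea under the assumption that one seed is drawn per estimate; the function `estimate_seeds` and its arguments are hypothetical and do not reflect how the controller is implemented.

```python
import random


def estimate_seeds(initial_seed, num_combinations, num_estimates, same_seed):
    """Draw one seed per estimate for each parameter combination."""
    rng = random.Random(initial_seed)
    seeds = []
    for _ in range(num_combinations):
        if same_seed:
            # Restart the generator so every combination sees the same seeds.
            rng = random.Random(initial_seed)
        seeds.append([rng.randrange(2**32) for _ in range(num_estimates)])
    return seeds


shared = estimate_seeds(0, num_combinations=3, num_estimates=2, same_seed=True)
distinct = estimate_seeds(0, num_combinations=3, num_estimates=2, same_seed=False)
```

With `same_seed=True`, every combination is evaluated under identical random draws, so differences between combinations reflect the parameter values rather than sampling noise; with `same_seed=False`, the generator simply keeps advancing from one combination to the next.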
- optimized_parameter_values¶
contains the values of the parameters of the model that best fit the data when the ParameterEstimationComposition is used for Data Fitting, or that optimize performance of the model according to the optimization_function when the ParameterEstimationComposition is used for Parameter Optimization. If parameter values are specified as ranges of values, then each item of optimized_parameter_values is the optimized value of the corresponding parameter. If parameter values are specified as priors, then each item of optimized_parameter_values is an array containing the values of the corresponding parameter, the distribution of which was determined to be optimal.
- Type
list
- optimal_value¶
contains the results returned by execution of agent_rep for the parameter values in optimized_parameter_values.
- Type
float
- results¶
contains the output_values of the OUTPUT Nodes in the model for every TRIAL executed (see Composition.results for more details). If the ParameterEstimationComposition is used for Data Fitting, and parameter values are specified as ranges of values, then each item of results is an array of output_values (sampled over num_estimates) obtained for the single optimized combination of parameter values contained in the corresponding item of optimized_parameter_values. If parameter values are specified as priors, then each item of results is an array of output_values (sampled over num_estimates), each of which corresponds to a combination of parameter values that was used to generate those results; it is the distribution of those parameter values that was found to best fit the data.
- Type
list[list[list]]
- class Parameters(owner, parent=None)¶
- initial_seed¶
-
- Default value
None
- Type
int
- same_seed_for_all_parameter_combinations¶
-
- Default value
False
- Type
bool
- _validate_data()¶
Check if user supplied data to fit is valid for data fitting mode.
- log_likelihood(*args, inputs=None, context=None)¶
Compute the log-likelihood of the data given the specified parameters of the model.
- Parameters
*args – Positional args, one for each parameter of the model. These must correspond directly to the parameters that have been specified in the parameters argument of the constructor.
- Returns
The sum of the log-likelihoods of the data given the specified parameters of the model.
- _complete_init_of_partially_initialized_nodes(context)¶
- Attempt to complete initialization of aux_components for any nodes with
aux_components that were not previously compatible with Composition
- exception psyneulink.core.compositions.parameterestimationcomposition.ParameterEstimationCompositionError(message, component=None)¶