ParameterEstimationComposition¶
Overview¶
A ParameterEstimationComposition is a subclass of Composition that is used to estimate specified parameters of a model Composition,
either to fit the outputs
of the model to a set of data (Data Fitting)
via likelihood maximization using kernel density estimation (KDE), or to optimize a user-provided scalar
objective_function (Parameter Optimization). In either case, when the
ParameterEstimationComposition is run with a given set of inputs,
it returns, in its optimized_parameter_values attribute, the set of parameter values that it estimates best satisfies the relevant
condition. The results attribute is also set to the optimal parameter
values. The arguments below are used to configure a ParameterEstimationComposition for either
Data Fitting or Parameter Optimization; they are followed by sections
that describe arguments specific to each.
model - specifies the Composition whose parameters are to be estimated.
Note
Neither the controller nor any of its associated arguments can be specified in the constructor for a ParameterEstimationComposition; the controller is constructed automatically using the arguments described below.
parameters - specifies the parameters of the model to be estimated. These are specified in a dict, in which the key of each entry specifies a parameter to estimate, and its value is a list of values to sample for that parameter.
outcome_variables - specifies the OUTPUT Nodes of the model, the values of which are used to evaluate the fit of the different combinations of parameter values sampled. An important limitation of the ParameterEstimationComposition (PEC) is that the outcome_variables must be a subset of the output ports of the model’s terminal Mechanism.
optimization_function - specifies the function used to search over the combinations of parameter values to be estimated. This must be either an instance of PECOptimizationFunction or a string name of one of the supported optimizers.
num_estimates - specifies the number of independent samples that are estimated for a given combination of parameter values.
num_trials_per_estimate - specifies the number of trials executed when the model is run for each estimate of a combination of parameter values. Typically, this can be left unspecified, in which case the model is run until all trials of inputs are exhausted.
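To make the size of the search space concrete, the following is a minimal sketch in plain Python (it does not call PsyNeuLink) of how a parameters dict whose values are lists of samples expands into combinations, and how num_estimates multiplies the number of model runs. The parameter names `rate` and `threshold` are hypothetical, and string keys are used only for illustration; the order and number of combinations actually visited depend on the optimization_function.

```python
from itertools import product

# Hypothetical parameters dict: each key names a parameter to estimate and
# each value lists the values to sample for it. String keys are an assumption
# made for illustration only.
parameters = {
    "rate": [0.1, 0.2, 0.3],
    "threshold": [0.5, 1.0],
}
num_estimates = 4  # independent samples per combination of parameter values

names = list(parameters)
# Every combination of sampled values, one dict per combination.
combinations = [dict(zip(names, values)) for values in product(*parameters.values())]

# An exhaustive search would run the model this many times in total.
total_model_runs = len(combinations) * num_estimates
print(len(combinations), total_model_runs)
```

An exhaustive search grows multiplicatively with the number of sampled values per parameter, which is why the lists of samples are usually kept short.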
Data Fitting¶
The ParameterEstimationComposition can be used to find a set of parameters for the model such that, when it is run with a given set of inputs, its results
best match (maximum likelihood) a specified set of empirical data. This requires that the data argument be
specified:
data - specifies the data to which the outcome_variables are fit in the estimation process. The data must be in a format that aligns with the specification of the outcome_variables: a pandas DataFrame in which each column corresponds to one of the outcome_variables. If one of the outcome variables should be treated as a categorical variable (e.g. a decision value in a two-alternative forced choice task modeled by a DDM), then it should be specified as a pandas Categorical variable.
objective_function - A function that computes the sum of the log likelihood of the data is automatically assigned for data fitting purposes and should not need to be specified. This function uses a kernel density estimation of the data to compute the likelihood of the data given the model. If you would like to use your own estimation of the likelihood, see Parameter Optimization below.
Warning
The objective_function argument should NOT be specified for data fitting; specifying both the data and objective_function arguments generates an error.
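The idea behind the automatically assigned objective can be illustrated with a small, self-contained sketch: build a kernel density estimate from values simulated by a model, then sum the log densities of the observed data under it. This is not PsyNeuLink's implementation; the Gaussian kernel, the fixed bandwidth, the function name and the toy numbers are all assumptions chosen for illustration.

```python
import math

def gaussian_kde_log_likelihood(simulated, observed, bandwidth=0.1):
    """Sum of log densities of `observed` under a Gaussian KDE built from `simulated`."""
    norm = 1.0 / (len(simulated) * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for x in observed:
        density = norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in simulated
        )
        # Floor the density so log() stays defined for points far from all samples.
        total += math.log(max(density, 1e-300))
    return total

# Toy data: one set of simulated outcomes lies near the observations, one far away.
observed = [0.48, 0.52, 0.50]
close_model = [0.45, 0.50, 0.55, 0.49, 0.51]
far_model = [0.90, 0.95, 1.00, 0.92, 0.98]

ll_close = gaussian_kde_log_likelihood(close_model, observed)
ll_far = gaussian_kde_log_likelihood(far_model, observed)
print(ll_close > ll_far)
```

Parameter values whose simulated outcomes sit closer to the data receive a higher summed log-likelihood, which is the quantity a maximum-likelihood search would try to increase.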
Parameter Optimization¶
The ParameterEstimationComposition can be used to find a set of parameters for the model such that, when it is run with a given set of inputs, its results
either maximize or minimize the objective_function, as determined by the optimization_function. This
requires that the objective_function argument be specified:
objective_function - specifies a function used to evaluate the values of the outcome_variables, according to which combinations of parameters are assessed; this must be a Callable that takes a 3d array as its only argument, the shape of which will be (num_estimates, num_trials, number of outcome_variables). The function should specify how to aggregate the value of each outcome_variable over num_estimates and/or num_trials if either is greater than 1.
Warning
The data argument should NOT be specified for parameter optimization; specifying both the objective_function and the data arguments generates an error.
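As a sketch of what such a callable might look like, the function below collapses an array of shape (num_estimates, num_trials, number of outcome_variables) to a single scalar by averaging the squared outcomes over every axis. The function name and the choice of aggregation are hypothetical. It uses plain iteration, so it accepts nested lists as well as array-like objects, and it does not call PsyNeuLink.

```python
def mean_squared_outcome(values):
    """Collapse a (num_estimates, num_trials, num_outcome_variables) array to a scalar."""
    total = 0.0
    count = 0
    for estimate in values:        # axis 0: num_estimates
        for trial in estimate:     # axis 1: num_trials
            for outcome in trial:  # axis 2: outcome_variables
                total += float(outcome) ** 2
                count += 1
    return total / count

# Toy input: 2 estimates x 3 trials x 1 outcome variable.
example = [
    [[1.0], [2.0], [3.0]],
    [[1.0], [2.0], [3.0]],
]
print(mean_squared_outcome(example))
```

Whether a value like this should be maximized or minimized is decided by the optimization_function, not by the objective itself.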
Supported Optimizers¶
Structure¶
ParameterEstimationComposition uses a PEC_OCM as its controller – a specialized
subclass of OptimizationControlMechanism that intercepts inputs provided to the run method of the ParameterEstimationComposition, and assigns them directly
to the state_feature_values of the PEC_OCM when it executes.
Class Reference¶
- class psyneulink.core.compositions.parameterestimationcomposition.ParameterEstimationComposition(parameters, outcome_variables, optimization_function, model=None, data=None, likelihood_include_mask=None, data_categorical_dims=None, objective_function=None, num_estimates=1, num_trials_per_estimate=None, initial_seed=None, same_seed_for_all_parameter_combinations=None, depends_on=None, name=None, context=None, **kwargs)¶
Subclass of Composition that estimates specified parameters either to fit the results of a Composition to a set of data or to optimize a specified function.
Automatically implements an OptimizationControlMechanism as its
controller, which is constructed using arguments to the ParameterEstimationComposition’s constructor as described below.
The following arguments are those specific to ParameterEstimationComposition; see Composition for additional arguments.
- Parameters:
parameters (dict) – specifies the parameters of the model to be estimated. These are specified in a dict, in which the key of each entry specifies a parameter to estimate, and its value is a list of values to sample for that parameter.
depends_on (Optional[Mapping]) – a dictionary that specifies which parameters depend on a condition. The keys of the dictionary are specified identically to the keys of the parameters dictionary. Each value is a string that names the column in the data on which the parameter depends. The values of this column must be categorical. Each unique value represents a condition and results in a separate parameter being estimated for it, so the number of unique values should be small.
outcome_variables (Union[list[Mechanism], Mechanism, list[OutputPort], OutputPort]) – specifies the OUTPUT Nodes of the model, the values of which are used to evaluate the fit of the different combinations of parameter values sampled. An important limitation of the ParameterEstimationComposition (PEC) is that the outcome_variables must be a subset of the output ports of the model’s terminal Mechanism.
model (Optional[Composition]) – specifies an external Composition for which parameters are to be fit to data or optimized according to a specified objective_function.
data (Optional[DataFrame]) – specifies the data to which the outcome_variables are fit in the estimation process. The data must be in a format that aligns with the specification of the outcome_variables: a pandas DataFrame in which each column corresponds to one of the outcome_variables. If one of the outcome variables should be treated as a categorical variable (e.g. a decision value in a two-alternative forced choice task modeled by a DDM), then it should be specified as a pandas Categorical variable.
data_categorical_dims (Union[Iterable] : default None) – specifies the dimensions of the data that are categorical. If a list of boolean values is provided, it is assumed to be a mask for the categorical data dimensions and must have the same length as the number of columns in data. If it is an iterable of integers, it is assumed to be a list of the indices of the categorical dimensions. If it is None, all data dimensions are assumed to be continuous. Alternatively, if data is a pandas DataFrame, then the columns that have Category dtype are assumed to be categorical.
objective_function (ObjectiveFunction, function or method) – specifies the function used by optimization_function (see objective_function for additional information); the shape of its variable argument (i.e., its first positional argument) must be the same as an array containing the value of the OutputPort corresponding to each item specified in outcome_variables.
optimization_function (OptimizationFunction, function or method : default or MaximumLikelihood or GridSearch) – specifies the function used to search over the combinations of parameter values to be estimated. This must be either an instance of PECOptimizationFunction or a string name of one of the supported optimizers.
num_estimates (int : default 1) – specifies the number of estimates made for each combination of parameter values (see num_estimates for additional information); it is passed to the ParameterEstimationComposition’s controller to set its num_estimates Parameter.
num_trials_per_estimate (int : default None) – specifies an exact number of trials to execute for each run of the model when estimating each combination of parameter values (see num_trials_per_estimate for additional information).
initial_seed (int : default None) – specifies the seed used to initialize the random number generator at construction; it is passed to the ParameterEstimationComposition’s controller to set its initial_seed Parameter.
same_seed_for_all_parameter_combinations (bool : default False) – specifies whether the random number generator is re-initialized to the same value when estimating each combination of parameter values; it is passed to the ParameterEstimationComposition’s controller to set its same_seed_for_all_allocations Parameter.
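The three accepted forms of data_categorical_dims (a boolean mask, an iterable of integer indices, or None) can be pictured with the helper below, which normalizes each form to one boolean per data column. The helper and its name are hypothetical and written only to illustrate the description above; it is not part of PsyNeuLink and does not show how the library resolves the argument internally.

```python
def categorical_mask(data_categorical_dims, num_columns):
    """Return one boolean per data column: True where the column is categorical."""
    if data_categorical_dims is None:
        # None: every dimension is treated as continuous.
        return [False] * num_columns
    dims = list(data_categorical_dims)
    if dims and all(isinstance(d, bool) for d in dims):
        # Booleans: already a mask, which must cover every column.
        if len(dims) != num_columns:
            raise ValueError("boolean mask must have one entry per data column")
        return dims
    # Integers: indices of the categorical dimensions.
    return [i in dims for i in range(num_columns)]

print(categorical_mask([1], 3))
```

For a three-column data set, `[False, True, False]` and `[1]` therefore describe the same layout, in which only the second column is categorical.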
- model¶
identifies the Composition used for Data Fitting or Parameter Optimization. If the model argument of the ParameterEstimationComposition’s constructor is not specified, model returns the ParameterEstimationComposition itself.
- Type:
- parameters¶
determines the parameters of the model used for Data Fitting or Parameter Optimization (see control for additional details).
- Type:
list[Parameters]
- parameter_ranges_or_priors¶
determines the range of values evaluated for each parameter. These are assigned as the allocation_samples for the ControlSignal assigned to the ParameterEstimationComposition’s OptimizationControlMechanism corresponding to each of the specified parameters.
- Type:
List[Union[Iterator, Function, list or Value]]
- outcome_variables¶
determines the OUTPUT Nodes of the model, the values of which are either compared to the data when the ParameterEstimationComposition is used for Data Fitting, or evaluated by the ParameterEstimationComposition’s optimization_function when it is used for Parameter Optimization.
- Type:
list[Composition Output Nodes]
- data¶
determines the data to be fit by the model when the ParameterEstimationComposition is used for Data Fitting. These must be structured in a form that aligns with the specified outcome_variables (see data for additional details). The data are passed to the optimizer used by optimization_function. Returns None if the model is being used for Parameter Optimization.
- Type:
array
- objective_function¶
determines the function used to evaluate the results of the model under each set of parameter values. It is passed to the ParameterEstimationComposition’s OptimizationControlMechanism as the function of its objective_mechanism, which is used to compute the net_outcome of the model each time it is run (see objective_function for additional details).
- Type:
ObjectiveFunction, function or method
- optimization_function¶
determines the function used to estimate the parameters of the model that either best fit the data when the ParameterEstimationComposition is used for Data Fitting, or that achieve some maximum or minimum value of the optimization_function when the ParameterEstimationComposition is used for Parameter Optimization. This is assigned as the function of the ParameterEstimationComposition’s OptimizationControlMechanism.
- Type:
- num_estimates¶
determines the number of estimates of the net_outcome of the model (i.e., number of calls to its evaluate method) for a given combination of parameter values (i.e., control_allocation) evaluated.
- Type:
int
- num_trials_per_estimate¶
imposes an exact number of trials to be executed in each run of model used to evaluate its net_outcome by a call to its OptimizationControlMechanism’s evaluate_agent_rep method. If it is None (the default), then either the number of inputs or the value specified for num_trials in the ParameterEstimationComposition’s run method is used to determine the number of trials executed (see number of trials for additional information).
Note
The num_trials_per_estimate is distinct from the num_trials argument of the ParameterEstimationComposition’s run method. The latter determines how many full fits of the model are carried out (that is, how many times the ParameterEstimationComposition itself is run), whereas num_trials_per_estimate determines how many trials are run for a given combination of parameter values within each fit.
- Type:
int or None
- initial_seed¶
contains the seed used to initialize the random number generator at construction. It is stored on the ParameterEstimationComposition’s controller, and setting it sets the value of that Parameter (see initial_seed for additional details).
- Type:
int or None
- same_seed_for_all_parameter_combinations¶
contains the setting for determining whether the random number generator used to select seeds for each estimate of the model's net_outcome is re-initialized to the same value for each combination of parameter values evaluated. Its value is stored on the ParameterEstimationComposition’s controller, and setting it sets the value of that Parameter (see same_seed_for_all_allocations for additional details).
- Type:
bool
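The effect of re-initializing the seed generator for each combination can be pictured with the standard library's random module. This is only an illustration of the behavior described above: the helper name is hypothetical, and the source does not show how the controller actually derives seeds.

```python
import random

def seeds_for_estimates(initial_seed, num_combinations, num_estimates, same_seed):
    """Return, for each parameter combination, the seeds used for its estimates."""
    seed_generator = random.Random(initial_seed)
    all_seeds = []
    for _ in range(num_combinations):
        if same_seed:
            # Re-initialize so every combination draws the same seed sequence.
            seed_generator = random.Random(initial_seed)
        all_seeds.append(
            [seed_generator.randrange(2**32) for _ in range(num_estimates)]
        )
    return all_seeds

shared = seeds_for_estimates(42, num_combinations=3, num_estimates=2, same_seed=True)
independent = seeds_for_estimates(42, num_combinations=3, num_estimates=2, same_seed=False)
print(shared[0] == shared[1] == shared[2])
```

With a shared seed sequence, differences in outcome between combinations reflect the parameter values rather than different random draws; without it, each combination is evaluated under its own draws.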
- optimized_parameter_values¶
contains the values of the parameters of the model that best fit the data when the ParameterEstimationComposition is used for Data Fitting, or that optimize performance of the model according to the optimization_function when the ParameterEstimationComposition is used for Parameter Optimization. If parameter values are specified as ranges of values, then each item of optimized_parameter_values is the optimized value of the corresponding parameter. If parameter values are specified as priors, then each item of optimized_parameter_values is an array containing the values of the corresponding parameter, the distribution of which was determined to be optimal.
- Type:
list
- optimal_value¶
contains the results returned by execution of agent_rep for the parameter values in optimized_parameter_values.
- Type:
float
- results¶
contains the output_values of the OUTPUT Nodes in the model for every TRIAL executed (see Composition.results for more details). If the ParameterEstimationComposition is used for Data Fitting, and parameter values are specified as ranges of values, then each item of results is an array of output_values (sampled over num_estimates) obtained for the single optimized combination of parameter values contained in the corresponding item of optimized_parameter_values. If parameter values are specified as priors, then each item of results is an array of output_values (sampled over num_estimates), each of which corresponds to a combination of parameter values that was used to generate those results; it is the distribution of those parameter values that was found to best fit the data.
- Type:
list[list[list]]
- class Parameters(owner, parent=None)¶
- initial_seed¶
-
- Default value:
None
- Type:
int
- same_seed_for_all_parameter_combinations¶
-
- Default value:
False
- Type:
bool
- _validate_data()¶
Check if user supplied data to fit is valid for data fitting mode.
- log_likelihood(*args, inputs=None, context=None)¶
Compute the log-likelihood of the data given the specified parameters of the model.
- Parameters:
*args – Positional args, one for each parameter of the model. These must correspond directly to the parameters that have been specified in the parameters argument of the constructor.
- Return type:
The sum of the log-likelihoods of the data given the specified parameters of the model.
- _complete_init_of_partially_initialized_nodes(context)¶
- Attempt to complete initialization of aux_components for any nodes with
aux_components that were not previously compatible with Composition
- exception psyneulink.core.compositions.parameterestimationcomposition.ParameterEstimationCompositionError(message, component=None)¶