Core Concepts

Researchers should use likelihood-based methods instead of closeness measures for fitting, evaluating, and comparing cognitive architecture models because likelihood offers a statistically sound approach with advantages in interpretability, model selection, and handling individual differences.

Abstract

Stocco, A., Mitsopoulos, K., Yang, Y. C., Hake, H. S., Haile, T., Leonard, B., & Gluck, K. (Year). Fitting, Evaluating, and Comparing Cognitive Architecture Models Using Likelihood: A Primer With Examples in ACT-R.

This paper aims to introduce and advocate for the use of likelihood-based methods in fitting, evaluating, and comparing cognitive architecture models, specifically focusing on the ACT-R architecture. The authors argue that likelihood offers a more statistically sound and advantageous approach compared to traditional closeness measures.

The authors provide a comprehensive tutorial on implementing likelihood-based methods using examples from the ACT-R cognitive architecture. They explain the concept of likelihood, its advantages, and how to apply it in different modeling scenarios, including individual and group level analysis. The paper utilizes data from the Incentive Processing Task (IPT) performed by participants from the Human Connectome Project dataset to illustrate the methods.

The paper demonstrates how likelihood-based methods can be used to:

- Estimate the plausibility of a model reproducing observed data.
- Explore the parameter space of a model to find the best-fitting parameters.
- Fit models using multiple behavioral measures simultaneously.
- Analyze data at both aggregate and trial-by-trial levels.
- Compare and select between competing models using criteria like AIC, BIC, and Bayes Factors.
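The first and last points above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the reaction times and the lognormal assumption are hypothetical stand-ins for a model's predicted distribution.

```python
import numpy as np
from scipy import stats

# Hypothetical reaction times (seconds) from one participant; the paper's
# actual data come from the HCP Incentive Processing Task.
rts = np.array([0.62, 0.71, 0.55, 0.90, 0.48, 0.66, 0.74, 0.58])

def log_likelihood(params, data):
    """Log-likelihood of the data under a lognormal RT distribution.

    (mu, sigma) stand in for a model's predicted RT distribution."""
    mu, sigma = params
    return np.sum(stats.lognorm.logpdf(data, s=sigma, scale=np.exp(mu)))

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k * ln(n) - 2 * log-likelihood."""
    return k * np.log(n) - 2 * log_lik

ll = log_likelihood((-0.4, 0.25), rts)
print(f"log-likelihood: {ll:.2f}, "
      f"AIC: {aic(ll, 2):.2f}, BIC: {bic(ll, 2, len(rts)):.2f}")
```

Lower AIC or BIC indicates a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily once n exceeds about 8 observations.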

The authors conclude that adopting likelihood-based methods can significantly benefit cognitive architecture research by providing a statistically rigorous framework for model validation, comparison, and selection. They encourage researchers to move away from traditional closeness measures and embrace likelihood for more robust and reliable model evaluation.

This paper provides a valuable resource for researchers in the field of cognitive architectures by offering a practical guide to implementing likelihood-based methods. The use of concrete examples and the availability of accompanying code make the concepts accessible and encourage wider adoption of these techniques.

The paper primarily focuses on the ACT-R architecture. While the authors suggest that the principles are generalizable to other cognitive architectures, further research is needed to explore the specific implementations and challenges in other frameworks. Additionally, the paper focuses on relatively simple models and tasks. Future work should investigate the application of likelihood-based methods to more complex cognitive models and real-world tasks.


Stats

Out of 15 ACT-R journal papers published in the past two years, none makes use of likelihood to evaluate models.
The example data comes from two participants from the HCP dataset.
The task was presented in two runs, each of which contains 32 trials divided into four blocks.
The model with default parameters d = 0.5 and s = 0.25 was simulated for 5,000 runs.
According to Powell’s method, the best-fitting parameters are d = 0.47, s = 0.65.
The optimized parameter values for Participant 2 are d = 0.263 and s = 0.432.
The model described has a maximum log-likelihood of –45.88 for Participant 1 and –35.64 for Participant 2.
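The fitting pipeline behind these numbers can be sketched as follows: simulate the model many times, turn the simulated runs into an empirical likelihood via kernel density estimation, and minimize the negative log-likelihood with Powell's method starting from the defaults d = 0.5, s = 0.25. The simulator below is a stand-in lognormal sampler (not the actual ACT-R model) and the data are synthetic.

```python
import numpy as np
from scipy import optimize, stats

# Synthetic "observed" reaction times standing in for one HCP participant.
rng = np.random.default_rng(0)
observed_rts = rng.lognormal(mean=-0.5, sigma=0.3, size=64)

def simulate_model(d, s, n_runs=5000):
    """Stand-in for 5,000 simulation runs at decay d and noise s.

    A real application would run the ACT-R model itself; a fixed seed
    keeps the objective deterministic for the optimizer."""
    return np.random.default_rng(1).lognormal(mean=-d, sigma=s, size=n_runs)

def negative_log_likelihood(params):
    d, s = params
    if s <= 0:
        return np.inf  # keep the noise parameter in a valid range
    sim = simulate_model(d, s)
    density = stats.gaussian_kde(sim)  # empirical likelihood via KDE
    return -np.sum(np.log(np.maximum(density(observed_rts), 1e-12)))

# Start from the default parameters (d = 0.5, s = 0.25), as in the paper.
result = optimize.minimize(negative_log_likelihood,
                           x0=[0.5, 0.25], method="Powell")
print("best-fitting (d, s):", result.x,
      "max log-likelihood:", -result.fun)
```

Powell's method is derivative-free, which suits simulation-based objectives like this one, where gradients are unavailable.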

Quotes

"Likelihood is a foundational concept in statistics, and has been advocated for and used to estimate parameters and compare formal models in mathematical psychology [5], computational neuroscience [6], computational psychiatry [7], and computational biology [8]."
"By adopting likelihood, researchers in the cognitive architecture domains can take advantage of modern ideas in statistics and make deeper connections with other modeling communities."
"Because the core equations underlying ACT-R and other cognitive architectures are often non-linear, error-based measures like MSE are biased and more likely to be thrown off by observations that are associated with greater errors (for example, outliers)."
"This flexibility is particularly important because modeling individual differences has become an important application of computational models [10–14]."

Key Insights Distilled From

by Andrea Stocco at **arxiv.org**, 10-24-2024

Deeper Inquiries

Likelihood-based methods offer a powerful approach to evaluate the generalizability of cognitive architecture models across different tasks and populations. Here's how:
1. Cross-Task Generalization:
- Model Fitting and Comparison: Fit the same cognitive architecture model, with minimal or no parameter modifications, to data from multiple tasks that tap into related cognitive processes.
- Likelihood Ratios and Bayes Factors: Use likelihood ratios or Bayes Factors to compare the model's goodness of fit across tasks. Consistently superior performance of one model across tasks provides evidence for its generalizability.
- Parameter Stability: Assess the stability of estimated model parameters across tasks. If the same parameter values, or values within a close range, consistently lead to good fits across tasks, it suggests that the model captures a general cognitive mechanism.
2. Cross-Population Generalization:
- Individual-Level Fitting: Fit the model to data from individuals belonging to different populations (e.g., different age groups, cultural backgrounds, or clinical diagnoses).
- Group-Level Likelihood: Calculate group-level likelihoods for each population, allowing for potential variations in parameter values across groups.
- Population Differences: Compare model fit and parameter estimates across populations. Significant differences might indicate the need for population-specific model adjustments or highlight variations in underlying cognitive processes.
3. Hierarchical Bayesian Modeling:
- Population-Level Parameters: Employ hierarchical Bayesian modeling to estimate both individual-level and population-level parameters. This approach allows for sharing information across individuals within a population, improving parameter estimation, especially for smaller sample sizes.
- Testing Population Differences: Hierarchical models can explicitly test for differences in parameter distributions across populations, providing a statistically rigorous way to assess generalizability and identify potential variations.
Example: Imagine evaluating an ACT-R model of working memory across a visual search task and an n-back task. By fitting the model to data from both tasks and comparing likelihoods, researchers can assess if the model, with its assumptions about memory retrieval and decay, generalizes across these tasks. Similarly, fitting the model to data from younger and older adults can reveal if age-related differences in working memory can be accounted for by changes in specific model parameters.
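A simple way to run the model comparisons described above is the BIC approximation to the Bayes Factor (Kass & Raftery, 1995). The log-likelihood values below are illustrative, not results from the paper.

```python
import numpy as np

# Hypothetical maximum log-likelihoods of two competing models fitted to
# the same data set of n observations (values are illustrative only).
ll_model_a, k_a = -45.88, 2
ll_model_b, k_b = -52.10, 3
n = 64

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k * ln(n) - 2 * log-likelihood."""
    return k * np.log(n) - 2 * log_lik

# BIC approximation to the Bayes Factor in favor of model A:
# BF_ab ~ exp((BIC_b - BIC_a) / 2); lower BIC means better fit.
bf_ab = np.exp((bic(ll_model_b, k_b, n) - bic(ll_model_a, k_a, n)) / 2)
print(f"approximate Bayes Factor for A over B: {bf_ab:.1f}")
```

A Bayes Factor well above 1 favors model A; conventionally, values above 100 are taken as decisive evidence.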

The reliance of likelihood on specific probability distributions can pose challenges when dealing with cognitive models involving complex or unknown generative processes. Here's a breakdown of the limitations and potential solutions:
Limitations:
- Assumption of Known Distributions: Likelihood-based methods often assume that the data-generating process follows a specific probability distribution (e.g., normal, log-logistic). This assumption might not hold for complex cognitive processes, leading to inaccurate model evaluation and parameter estimates.
- Difficulty in Specifying Distributions: For some cognitive models, especially those involving intricate interactions between modules or non-linear dynamics, deriving the exact probability distributions of model predictions can be extremely challenging or even impossible.
Solutions and Mitigations:
- Distribution-Free Methods: Explore distribution-free or non-parametric methods that do not rely on specific distributional assumptions. These methods, such as bootstrapping or permutation tests, can provide robust model comparisons even when the underlying distributions are unknown.
- Empirical Distribution Estimation: Instead of assuming a theoretical distribution, estimate the empirical distribution of model predictions through Monte Carlo simulations. Techniques like kernel density estimation can then be used to approximate the probability density function from the simulated data.
- Approximate Bayesian Computation (ABC): ABC methods provide a powerful alternative when likelihoods are intractable. Instead of calculating likelihoods directly, ABC compares simulated data from the model to the observed data using summary statistics. This allows for model evaluation and parameter estimation without explicit likelihood calculations.
- Model Simplification: In some cases, simplifying the cognitive model or focusing on specific aspects of the data-generating process might make it feasible to derive approximate probability distributions. While simplification comes with a trade-off in terms of realism, it can make likelihood-based methods more tractable.
Example: Consider a cognitive model of decision-making under uncertainty that incorporates complex heuristics and learning mechanisms. Deriving the exact probability distribution of choices made by this model might be very difficult. In this case, researchers could use Monte Carlo simulations to generate a large sample of simulated choices and then employ kernel density estimation to approximate the distribution. Alternatively, they could use ABC methods, comparing summary statistics of the simulated and observed choices to evaluate the model.
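The ABC idea can be sketched with rejection sampling. The Gaussian simulator below is a trivial stand-in for an intractable cognitive model, and all numbers (prior range, tolerance `eps`, true parameter) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "observed" data; a Gaussian stands in for the output of an
# intractable cognitive model with true parameter 0.3.
observed = rng.normal(0.3, 0.1, size=100)

def simulate(theta, n=100, seed=None):
    """Hypothetical simulator: one free parameter, fixed noise."""
    return np.random.default_rng(seed).normal(theta, 0.1, size=n)

def summary(x):
    """Summary statistics used in place of a full likelihood."""
    return np.array([x.mean(), x.std()])

# ABC rejection sampling: draw parameters from the prior, simulate, and
# keep draws whose summary statistics land within eps of the observed ones.
obs_stats = summary(observed)
eps = 0.05
accepted = []
for i in range(5000):
    theta = rng.uniform(-1, 1)  # prior draw
    if np.linalg.norm(summary(simulate(theta, seed=i)) - obs_stats) < eps:
        accepted.append(theta)

posterior_mean = np.mean(accepted)
print(f"{len(accepted)} accepted draws, "
      f"posterior mean ~ {posterior_mean:.3f}")
```

The accepted draws approximate the posterior over the parameter; shrinking `eps` sharpens the approximation at the cost of more rejected simulations.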

The insights gained from likelihood-based model evaluation can be instrumental in guiding the development of more accurate and sophisticated cognitive architectures. Here's how:
1. Identifying Model Misfit:
- Pinpointing Discrepancies: Likelihood-based methods can reveal specific areas where a model's predictions deviate significantly from observed data. This can involve analyzing individual-level fits, comparing model performance across experimental conditions, or examining the influence of different parameters.
- Guiding Model Revision: By pinpointing areas of misfit, researchers can identify potential weaknesses in the cognitive architecture's assumptions or mechanisms. This can lead to targeted revisions, such as incorporating new modules, refining existing equations, or introducing additional parameters.
2. Parameter Interpretation and Constraint:
- Parameter Sensitivity Analysis: Likelihood profiles and confidence intervals can reveal how sensitive the model's fit is to changes in specific parameters. Parameters to which the model is highly sensitive might reflect critical cognitive processes that require further investigation.
- Constraining Theories: Parameter estimates can provide quantitative constraints on cognitive theories. For example, if a model consistently requires extremely low decay rates to fit human memory data, it might challenge theories that assume rapid forgetting.
3. Model Comparison and Selection:
- Quantitative Model Comparison: Likelihood ratios and Bayes Factors offer a statistically principled way to compare the relative fit of competing cognitive architectures or alternative versions of the same architecture.
- Theory Adjudication: Model comparison based on likelihood can help researchers adjudicate between competing cognitive theories. By comparing models that embody different theoretical assumptions, researchers can gain evidence in favor of one theory over another.
4. Guiding New Experimentation:
- Generating Predictions: Well-fitting models, parameterized using likelihood-based methods, can be used to generate precise predictions for new experimental conditions or populations.
- Hypothesis-Driven Research: These predictions can guide the design of future experiments, leading to a more hypothesis-driven approach to cognitive architecture research.
Example: Imagine a cognitive architecture model of attention that consistently underestimates the impact of distractors in visual search tasks. Likelihood-based analysis might reveal that the model's parameters related to distractor suppression are consistently estimated to be weaker than expected. This insight could lead researchers to revise the architecture by incorporating a more robust distractor suppression mechanism or by refining the equations governing attentional selection.
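A comparison like the one in this example, between a baseline architecture and a nested variant that adds one distractor-suppression parameter, could be run as a likelihood ratio test. The log-likelihood values below are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical maximum log-likelihoods for a baseline attention model and
# a nested variant with one extra distractor-suppression parameter.
ll_baseline = -210.4   # 3 free parameters
ll_extended = -204.1   # 4 free parameters

# Likelihood ratio test: under the null hypothesis that the extra
# parameter adds nothing, 2 * delta-log-likelihood follows a chi-square
# distribution with df = (difference in parameter counts) = 1.
lr_stat = 2 * (ll_extended - ll_baseline)
p_value = stats.chi2.sf(lr_stat, df=1)
print(f"LR statistic: {lr_stat:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that the improvement in fit is larger than expected by chance, justifying the added mechanism; the test applies only to nested models, which is why AIC, BIC, or Bayes Factors are used for non-nested comparisons.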
