This one is finally out!
We introduce g-distance, a measure derived using set theory that assesses model adequacy by comparing the range of behaviours a model can exhibit to the range humans actually show. #computationalmodeling
#cognitivescience
g-distance comprises two easily interpretable dimensions: accommodation (α), the proportion of observed human behaviours that the model can produce, and excess flexibility (β), the proportion of unobserved behavioural patterns that the model produces.
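For concreteness, the two dimensions can be written in set notation (a sketch consistent with the definitions above; the paper's exact formalisation, and in particular how α and β combine into a single g value, may differ). With H the set of behavioural patterns observed in humans, M the set of patterns the model can produce, and U the universe of possible patterns:

```latex
\alpha = \frac{|M \cap H|}{|H|}, \qquad
\beta  = \frac{|M \setminus H|}{|U \setminus H|}
% One illustrative combination (an assumption, not necessarily the
% paper's): distance from the ideal model at (\alpha, \beta) = (1, 0),
% with a weight w encoding beliefs about their relative importance.
g = \sqrt{w\,(1 - \alpha)^2 + (1 - w)\,\beta^2}
```

An ideal model sits at α = 1, β = 0: it produces everything humans do and nothing they don't.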
To determine the range of a model's behaviour, we use parameter-space partitioning (PSP), a sophisticated MCMC procedure that explores the model's parameter space to discover all the distinct ordinal patterns of behaviour it can generate.
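To give a feel for the idea, here is a toy Python sketch (not the paper's implementation: the stand-in model, its parameters, and the crude random-walk search are all invented for illustration; real PSP uses a purpose-built MCMC scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

def model_predictions(theta):
    """Hypothetical stand-in model: maps a 2-D parameter vector
    to response strengths for three choice options."""
    a, b = theta
    return np.array([a, b, 1 - a * b])

def ordinal_pattern(theta):
    """Reduce predictions to a qualitative (ordinal) pattern:
    the rank order of the response strengths."""
    return tuple(np.argsort(model_predictions(theta)))

# Random-walk exploration of the unit square, collecting every
# distinct ordinal pattern the model produces along the way.
patterns = set()
theta = np.array([0.5, 0.5])
for _ in range(20_000):
    theta = np.clip(theta + rng.normal(scale=0.1, size=2), 0.0, 1.0)
    patterns.add(ordinal_pattern(theta))

print(f"Distinct ordinal patterns found: {len(patterns)}")
```

Each distinct pattern corresponds to one region of the partitioned parameter space; the resulting set of patterns is what α and β are computed over.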
We applied g-distance to five models of the inverse base-rate effect (IBRE), an irrational learning phenomenon. The candidates included exemplar-based attentional learning models, simpler neural networks with competitive gating or rapid attentional shifts, and dissimilarity-based context models.
Our analysis revealed that two models, a neural network with rapid attention shifts and a dissimilarity generalised context model, outperformed the previous market leader EXIT on g-distance, an outcome that holds across a wide array of beliefs about the relative importance of α and β.
Interestingly, we found that some models could accommodate human behaviours in ways that were not intuitively obvious, highlighting the importance of formally expressing psychological theories and systematically exploring their capabilities.
We also compared g-distance with the Bayesian information criterion (BIC). Strikingly, BIC misidentified a known-poor model as the best. Unlike g-distance, BIC is agnostic to qualitative changes in a model's performance and cannot capture or communicate the interplay between flexibility and accommodation.
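For reference, BIC scores a fitted model only through its maximised likelihood and a parameter-count penalty:

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
```

where k is the number of free parameters, n the number of observations, and L̂ the maximised likelihood. Nothing in this expression registers which qualitative patterns a model can or cannot produce, which is why a flexible model that fits the data well can win under BIC while failing on g-distance.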
g-distance offers a novel framework for evaluating computational models by considering not just how well models produce observed behaviours, but how well they reject unobserved ones.
The framework encourages a shift in focus towards understanding the full behavioural repertoire of models and their alignment with the diversity observed in human behaviour, paving the way for more robust model evaluation and theory development.