MindSet: Vision is a toolbox for testing deep neural networks (DNNs) on visual psychological phenomena. It provides a large, accessible, and highly configurable collection of 30 image datasets covering a wide range of well-replicated visual experiments and phenomena reported in psychology.
The datasets span low-level vision (e.g., Weber's law), mid-level vision (e.g., Gestalt effects), visual illusions, and object recognition tasks. Each dataset can be regenerated with a different configuration (image size, background color, stroke color, number of samples, etc.), so it can be adapted to different research contexts.
To enable experimentation, the toolbox provides scripts for three testing methods: Similarity Judgment Analysis, Decoder Approach, and Out-of-Distribution classification. These methods allow researchers to systematically evaluate how well DNNs capture key aspects of human visual perception, going beyond the typical observational benchmarks.
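As a rough illustration of the idea behind a similarity judgment analysis, the sketch below compares a model's internal activations for a base image against its activations for two modified versions of that image. It is a minimal sketch, not the toolbox's own script: the function names (`cosine_similarity`, `similarity_effect`) and the activation vectors are hypothetical, and cosine similarity is assumed as the comparison metric.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two activation vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("activation vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("activation vectors must be non-zero")
    return dot / (norm_a * norm_b)


def similarity_effect(
    base: Sequence[float],
    condition_a: Sequence[float],
    condition_b: Sequence[float],
) -> float:
    """Difference in similarity to the base image between two conditions.

    A positive value means the model represents condition A as closer
    to the base image than condition B.
    """
    return cosine_similarity(base, condition_a) - cosine_similarity(base, condition_b)


# Made-up activations standing in for one layer's output (hypothetical values).
base_activation = [1.0, 0.0, 1.0]
condition_a_activation = [1.0, 0.1, 0.9]  # change humans judge as minor
condition_b_activation = [0.0, 1.0, 0.2]  # change humans judge as major

effect = similarity_effect(
    base_activation, condition_a_activation, condition_b_activation
)
print(f"similarity effect: {effect:.3f}")
```

In practice the activation vectors would come from a chosen layer of the network under test, and the sign and size of the effect would be compared with the pattern reported for human observers.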
The authors provide examples illustrating the use of these methods with a classic feed-forward CNN (ResNet-152), and the code is extensively documented to facilitate adoption and extension by the research community.
By bridging the gap between computational modeling and psychological research, MindSet: Vision aims to drive further interest in testing DNN models against key experiments reported in psychology, in order to better characterize DNN-human alignment and build better DNN models of human vision.
Source: arxiv.org