Unified Transformer Model for Predicting Human Attention Scanpaths in Visual Search and Free Viewing
A single transformer-based model, the Human Attention Transformer (HAT), predicts human scanpaths in both top-down visual search and bottom-up free viewing, outperforming previous state-of-the-art methods on both tasks.