Limitations of Multimodal AI Systems in Spatial Perspective-Taking
Multimodal AI systems such as GPT-4o show significant limitations in human-like spatial perspective-taking, particularly on tasks that require mental rotation or alignment with an alternative viewpoint.
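
To make concrete what alignment with an alternative viewpoint involves, the following minimal sketch computes whether an object lies to the left or right of an observer at a given position and heading. It is an illustration only, not a method taken from any evaluation of these systems: the function name `egocentric_side`, the 2D layout, and the angle convention (degrees, counterclockwise from the +x axis) are all assumptions made for this example.

```python
import math


def egocentric_side(observer_xy, heading_deg, target_xy, eps=1e-9):
    """Return 'left', 'right', or 'center' for a target as seen by an observer.

    Hypothetical helper for illustration. heading_deg is the direction the
    observer faces, in degrees counterclockwise from the +x axis.
    """
    h = math.radians(heading_deg)
    # Vector from the observer to the target in world coordinates.
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    # Project onto the observer's "left" axis, which is the forward
    # direction (cos h, sin h) rotated 90 degrees counterclockwise.
    leftward = dx * (-math.sin(h)) + dy * math.cos(h)
    if leftward > eps:
        return "left"
    if leftward < -eps:
        return "right"
    return "center"


# Two observers face each other across the same object.
obj = (1.0, 0.0)
side_a = egocentric_side((0.0, -2.0), 90.0, obj)   # observer A faces +y
side_b = egocentric_side((0.0, 2.0), 270.0, obj)   # observer B faces -y
print(side_a, side_b)
```

The same object falls on opposite sides for the two observers, which is the kind of viewpoint-dependent judgment a perspective-taking task asks for.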