Can Foundation Models Grasp Space? ‘Theory of Space’ Benchmark Reveals Exploration Bottleneck

A new benchmark, “Theory of Space,” tests whether foundation models can construct, revise, and exploit spatial beliefs through active exploration. The evaluation probes how well advanced AI systems comprehend and navigate the physical world around them.

Across six state-of-the-art models, researchers identified significant limitations: a critical exploration bottleneck, a persistent gap between text and vision modalities, and severe deficiencies in acquiring and revising spatial beliefs. Current models struggle to learn efficiently from limited environmental interactions.

This research highlights where AI’s spatial reasoning most needs improvement and points toward future systems with a more human-like understanding of their surroundings.