Core Concepts
Augmented Object Intelligence (AOI) is a novel XR interaction paradigm that blurs the line between digital and physical by equipping real-world objects with the ability to interact as if they were digital. Every object thus has the potential to serve as a portal to vast digital functionalities.
Abstract
The paper introduces Augmented Object Intelligence (AOI), a novel XR interaction paradigm that aims to seamlessly integrate physical objects as interactive digital entities. The highlights are:
AOI leverages object segmentation, classification, and Multimodal Large Language Models (MLLMs) to facilitate rich interactions with physical objects in XR environments.
The authors implement the AOI concept in the form of XR-Objects, an open-source prototype system that enables users to engage with their physical environment in contextually relevant ways. XR-Objects allows analog objects to not only convey information but also to initiate digital actions.
The system architecture combines object detection using MediaPipe, 3D localization and anchoring using ARCore/ARKit, and object-specific MLLM instances to provide detailed information and enable a variety of interactions, such as querying for details, comparing objects, setting timers, and adding notes.
The authors conduct a user study comparing XR-Objects to a state-of-the-art MLLM assistant interface, demonstrating significant improvements in task completion time and user experience metrics like ease of use and satisfaction.
The paper outlines diverse application scenarios for XR-Objects, including discovery, productivity, learning, IoT connectivity, and cooking, showcasing the potential of AOI to transform how users interact with their physical surroundings.
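The architecture described above (detect an object, anchor it in 3D, then attach an object-specific MLLM instance) can be illustrated with a minimal Python sketch. This is not the XR-Objects implementation: the names DetectedObject, ObjectSession, Scene, and fake_mllm are all hypothetical, the detector and AR framework are replaced by hand-written values, and the model is a placeholder that only echoes its prompt. The sketch shows just one idea, namely that each anchored object keeps its own conversation state so that questions and notes stay tied to that object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DetectedObject:
    """One physical object found by the detector (hypothetical structure)."""
    label: str                          # class label an object detector would supply
    anchor: Tuple[float, float, float]  # 3D world position an AR framework would supply


@dataclass
class ObjectSession:
    """Per-object state, standing in for an object-specific MLLM instance."""
    obj: DetectedObject
    history: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

    def ask(self, question: str, mllm: Callable[[str], str]) -> str:
        # Prefix the prompt with the object's label so the model answers in context.
        prompt = f"[object: {self.obj.label}] {question}"
        answer = mllm(prompt)
        self.history.extend([question, answer])
        return answer

    def add_note(self, text: str) -> None:
        # Notes are stored on the object itself, not in a global list.
        self.notes.append(text)


class Scene:
    """Keeps one session per anchored object so context never mixes between objects."""

    def __init__(self, mllm: Callable[[str], str]) -> None:
        self.mllm = mllm
        self.sessions: Dict[int, ObjectSession] = {}

    def register(self, obj: DetectedObject) -> int:
        object_id = len(self.sessions)
        self.sessions[object_id] = ObjectSession(obj)
        return object_id

    def ask(self, object_id: int, question: str) -> str:
        return self.sessions[object_id].ask(question, self.mllm)


def fake_mllm(prompt: str) -> str:
    """Placeholder model (assumption): echoes the prompt instead of calling a real MLLM."""
    return f"(model reply to: {prompt})"


if __name__ == "__main__":
    scene = Scene(fake_mllm)
    jar = scene.register(DetectedObject("pasta sauce jar", (0.1, 0.0, -0.5)))
    print(scene.ask(jar, "How many calories per serving?"))
    scene.sessions[jar].add_note("Buy again")
```

In a real system the label and anchor would come from the detection and AR layers, and fake_mllm would be swapped for a call to an actual multimodal model that also receives the cropped image of the object.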
Stats
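Participants completed all tasks in an average of 217.5 seconds with XR-Objects, compared to 286.3 seconds with the Chatbot interface. The Chatbot therefore took about 31% longer; equivalently, XR-Objects reduced completion time by about 24%.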
Responses for Ease of Information Retrieval were highly skewed for both XR-Objects (γ1 = 1.19) and Chatbot (γ1 = 1.8), indicating strongly positive ratings.
The Tool Ease responses were highly skewed for Chatbot (γ1 = 2.25) but not for XR-Objects (γ1 = 0.03), suggesting better perceived ease of use for the prototype.
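The figures above can be checked and interpreted with a few lines of Python. The sketch below makes two assumptions: that a percentage comparison of the two mean times can use either time as the baseline, and that γ1 denotes the standard moment-based skewness coefficient (the source does not say whether a sample correction was applied). The Likert responses in the usage lines are invented for illustration and are not data from the study.

```python
from typing import Sequence


def percent_longer(slow: float, fast: float) -> float:
    """How much longer `slow` is than `fast`, as a percentage of `fast`."""
    return (slow - fast) / fast * 100.0


def percent_reduction(slow: float, fast: float) -> float:
    """How much shorter `fast` is than `slow`, as a percentage of `slow`."""
    return (slow - fast) / slow * 100.0


def skewness(values: Sequence[float]) -> float:
    """Moment-based skewness m3 / m2**1.5, using central moments m2 and m3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    # Mean completion times reported above, in seconds.
    print(round(percent_longer(286.3, 217.5), 1))     # Chatbot relative to XR-Objects
    print(round(percent_reduction(286.3, 217.5), 1))  # XR-Objects relative to Chatbot

    # Hypothetical Likert responses (NOT study data): most answers cluster low.
    print(skewness([1, 1, 1, 1, 2, 2, 5]) > 0)
```

The two percentages differ because they use different baselines: the Chatbot time is about 31.6% longer than the XR-Objects time, while the XR-Objects time is about 24.0% shorter than the Chatbot time. For skewness, a positive value means responses cluster at the low end of the scale with a tail toward high values, so whether that corresponds to favorable ratings depends on how the scale was coded.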
Quotes
"Augmented Object Intelligence (AOI) is a novel XR interaction paradigm that blurs the lines between digital and physical by equipping real-world objects with the ability to interact as if they were digital, where every object has the potential to serve as a portal to vast digital functionalities."
"XR-Objects embodies this idea and aims to demonstrate and investigate 'semantic equality' between real and virtual objects."