Core Concepts
SARA integrates eye tracking and large language models within a mixed reality framework to provide personalized, real-time reading support, helping users overcome challenges such as unfamiliar vocabulary and complex sentence structures.
Abstract
The paper introduces SARA (Smart AI Reading Assistant), a mixed reality (MR) system that leverages eye tracking and state-of-the-art large language models (LLMs) to enhance the reading experience by offering personalized assistance in real time.
Key components of SARA:
Text position identification: SARA uses a QR code to accurately locate the text the user is reading and place virtual markers in the corresponding positions.
Text extraction: SARA captures frames from the user's field of view, crops the region of interest, and applies optical character recognition (OCR) to extract the text content.
Gaze tracking and alignment: SARA tracks the user's eye movements and aligns the gaze data with the extracted text to identify the user's focus and potential areas of reading difficulty.
Reading difficulty classification: SARA detects reading challenges by analyzing gaze patterns, such as increased dwell time on unfamiliar words and regressions in reading patterns.
Reading support: SARA utilizes GPT-4 to provide personalized assistance, such as definitions, translations, and paraphrasing, to help users overcome comprehension difficulties.
Seamless integration: SARA presents the support solutions as virtual overlays within the user's augmented reality environment.
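The gaze-to-text alignment step can be sketched as a point-in-box lookup: once OCR has produced words with bounding boxes, each gaze sample is matched to the word whose box contains it. The `WordBox` structure and the `slack` padding below are illustrative assumptions, not details from the paper.

```python
from dataclasses import dataclass

@dataclass
class WordBox:
    """A word with its bounding box in screen coordinates (hypothetical layout)."""
    text: str
    x: float   # left edge
    y: float   # top edge
    w: float   # width
    h: float   # height

def word_at_gaze(words, gx, gy, slack=5.0):
    """Return the word whose (slightly padded) bounding box contains the gaze
    point (gx, gy), or None if the gaze falls between words."""
    for word in words:
        if (word.x - slack <= gx <= word.x + word.w + slack and
                word.y - slack <= gy <= word.y + word.h + slack):
            return word
    return None

# Example: two words on one line of extracted text
line = [WordBox("ubiquitous", 0, 0, 80, 20), WordBox("computing", 90, 0, 75, 20)]
print(word_at_gaze(line, 100, 10).text)  # -> computing
```

In practice the padding would absorb small eye-tracker calibration errors; a real system would also smooth raw gaze samples into fixations before matching.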
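The difficulty-classification step analyzes two gaze signals the summary names: long dwell time on a word and regressions (re-fixating a word already passed). A minimal sketch, with illustrative thresholds that are assumptions rather than values from the paper:

```python
def classify_difficulty(fixations, dwell_threshold_ms=600, regression_threshold=2):
    """Flag word indices the reader likely struggled with.

    fixations: ordered list of (word_index, duration_ms) tuples, one per fixation.
    A word is flagged if its total dwell time reaches dwell_threshold_ms, or if
    the reader regressed back to it at least regression_threshold times.
    """
    dwell = {}
    regressions = {}
    max_seen = -1
    for idx, duration in fixations:
        dwell[idx] = dwell.get(idx, 0) + duration
        if idx < max_seen:  # gaze jumped backwards in the text: a regression
            regressions[idx] = regressions.get(idx, 0) + 1
        max_seen = max(max_seen, idx)
    return sorted(
        idx for idx in dwell
        if dwell[idx] >= dwell_threshold_ms
        or regressions.get(idx, 0) >= regression_threshold
    )

# Reader lingers on word 3 and keeps returning to word 1
fixations = [(0, 180), (1, 220), (2, 200), (1, 150), (3, 700), (1, 160)]
print(classify_difficulty(fixations))  # -> [1, 3]
```

Word 1 is flagged by its two regressions, word 3 by its 700 ms dwell; either signal alone suffices, matching the summary's description of both cues as markers of difficulty.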
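For the reading-support step, a system like SARA would turn the flagged word and its sentence into a prompt for GPT-4 (e.g. via the OpenAI chat completions API). The templates and function below are a hypothetical sketch of that prompt construction, not the paper's actual prompts:

```python
SUPPORT_TEMPLATES = {
    # Support types mirror those listed above; the wording is an assumption.
    "definition": "Define the word '{target}' as used in this sentence: {context}",
    "translation": "Translate this sentence into {language}: {context}",
    "paraphrase": "Rewrite this sentence in simpler words: {context}",
}

def build_support_prompt(kind, context, target=None, language="Spanish"):
    """Build the user prompt that would be sent to the LLM for the chosen
    kind of reading support (definition, translation, or paraphrasing)."""
    return SUPPORT_TEMPLATES[kind].format(
        target=target, context=context, language=language
    )

print(build_support_prompt(
    "definition",
    "The results were largely ephemeral.",
    target="ephemeral",
))
```

The LLM's response would then be rendered as a virtual overlay anchored near the difficult word, per the seamless-integration component above.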
The paper highlights the potential of SARA to transform the reading experience and improve reading proficiency by leveraging the capabilities of eye tracking, LLMs, and MR technology.