Core Concepts
The authors present a method for direct 2D-to-3D registration of surgical microscope video to pre-operative CT scans without external tracking equipment, aiming to improve cochlear implant surgery outcomes through augmented reality.
Abstract
Augmented reality (AR) can enhance cochlear implant (CI) surgery by overlaying important information onto the surgical scene. The proposed method performs surface mapping of the incus and uses pose estimation to register monocular microscope images to the pre-operative CT, addressing challenges such as scarce training data and limited visibility in the surgical field. The approach achieves an average rotation error of less than 25 degrees and translation errors under 2 mm, showing promise for improving surgical accuracy. By adapting neural networks and deep learning techniques, the method aims to pave the way for AI-powered AR in a range of surgeries.
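The registration described above amounts to estimating a camera pose (rotation and translation) that maps CT-space anatomy into the microscope image. The sketch below shows the underlying pinhole projection step only; the intrinsics, pose, and landmark coordinates are hypothetical illustration values, not taken from the paper, and the paper's actual pose estimation is learned rather than hand-specified.

```python
import numpy as np

def project_points(pts3d, R, t, K):
    """Project 3D CT-space points into the image plane using an
    estimated camera pose (R, t) and pinhole intrinsics K."""
    cam = (R @ pts3d.T).T + t        # CT frame -> camera frame
    uv = (K @ cam.T).T               # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]    # perspective divide

# Hypothetical values for illustration only (not from the paper)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                        # identity rotation
t = np.array([0.0, 0.0, 50.0])       # 50 mm in front of the camera
pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]])    # stand-in incus landmarks in CT space
print(project_points(pts, R, t, K))
```

Once a pose is estimated, projecting the CT surface this way is what allows the AR overlay to be drawn on the live microscope view.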
Stats
Our results demonstrate accuracy with an average rotation error of less than 25 degrees and translation errors of less than 2 mm, 3 mm, and 0.55% for the x, y, and z axes, respectively.
Our training dataset contains nine video frames, while our validation dataset includes three, each from unique surgical cases.
To enlarge the training set, we use data augmentation techniques such as flipping, rotating, and translating, expanding our data size by a factor of 1000 with synthetically generated samples.
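The flip/rotate/translate augmentation described above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: rotations are restricted to 90-degree steps and translation uses wrap-around shifts for brevity, and the 8x8 array stands in for a real video frame.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(frame, rng):
    """Produce one synthetic variant of a frame via random flip,
    rotation (90-degree steps), and small integer translation."""
    out = frame
    if rng.random() < 0.5:
        out = np.fliplr(out)                 # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))
    dy, dx = rng.integers(-2, 3, size=2)
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))  # translate (wraps at edges)
    return out

frame = np.arange(64, dtype=float).reshape(8, 8)     # stand-in for a video frame
augmented = [augment(frame, rng) for _ in range(1000)]  # 1000x expansion per frame
print(len(augmented), augmented[0].shape)
```

In practice each of the nine training frames would be expanded this way, with interpolation-based rotations and padded translations rather than the simplified variants shown here.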
The z-axis translation error is reported as a percentage of the estimated focal length.
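Expressing the z-axis error relative to the focal length can be computed as below; the helper name and the numeric values are hypothetical, chosen only to illustrate the unit convention.

```python
def z_error_percent(z_est, z_true, focal_length):
    """Z-axis translation error as a percentage of the focal length.

    z_est, z_true, and focal_length are assumed to share the same
    unit (e.g. millimeters), so the ratio is dimensionless.
    """
    return abs(z_est - z_true) / focal_length * 100.0

# Hypothetical numbers for illustration: a 1 mm depth error against
# a 200 mm focal length corresponds to a 0.5% error.
print(z_error_percent(z_est=52.0, z_true=51.0, focal_length=200.0))
```

Normalizing by focal length makes the depth error comparable across cases where the microscope's zoom, and hence the effective focal length, differs.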