This paper proposes a novel visual gyroscope method that combines an analytical approach to computing spherical moment coefficients with learning-based optimization, providing efficient and accurate 3D rotation estimation from spherical images.
Computer vision technology, which emulates human visual perception, plays a crucial role in enabling robots to perceive and understand their surroundings, driving advances in tasks such as autonomous navigation, object recognition, and waste management. By integrating computer vision with robot control, robots gain the ability to interact intelligently with their environment, improving efficiency, quality, and environmental sustainability.
By combining multi-view visual data and tactile sensing information within a 3D Gaussian Splatting framework, the proposed method achieves state-of-the-art geometry reconstruction and novel view synthesis for challenging surfaces, outperforming prior vision-only approaches.
This work proposes an open-vocabulary framework that predicts object poses and sizes from text descriptions, supported by a large-scale dataset.