Scalable Reconstruction of Hand-Held Objects from Monocular RGB Images
A scalable paradigm for reconstructing hand-held objects from monocular RGB images by jointly inferring hand and object geometry and by leveraging large language/vision models for automated 3D object retrieval and alignment.