Core Concepts
This work introduces 3D-Fauna, a method that learns a deformable 3D model for over 100 different quadruped animal species using only 2D Internet images as training data.
Abstract
The paper presents 3D-Fauna, a method that learns a pan-category deformable 3D model for more than 100 different quadruped animal species. The key innovations are:
Semantic Bank of Skinned Models (SBSM): This automatically discovers a small set of base animal shapes by combining geometric inductive priors with semantic knowledge from an off-the-shelf self-supervised feature extractor. Sharing these base shapes across species lets the model capture the diverse shape variations among different animals.
Mask Discriminator: This encourages the predicted 3D shapes to look realistic from arbitrary viewpoints, mitigating the viewpoint bias in Internet images.
Large-scale Fauna Dataset: The authors collected a new dataset of over 78,000 images spanning 128 quadruped species, which is used to train the pan-category 3D model.
At inference, the model can take a single image of any quadruped animal and reconstruct an articulated, textured 3D mesh in a feed-forward manner. Extensive experiments show significant improvements over prior methods, which are limited to reconstructing a single animal category or require additional supervision.
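To make the Semantic Bank idea more concrete, here is a minimal NumPy sketch of one plausible way such a bank could be queried: an image feature is compared against a set of bank keys, and the matching base-shape codes are blended by the resulting weights. This is an illustration under stated assumptions, not the authors' implementation. The function name query_bank, the key/value layout, the cosine-similarity softmax, and the temperature parameter are all hypothetical; in the actual method the feature would come from the self-supervised extractor and the blended code would drive a skinned shape model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def query_bank(image_feature, keys, values, temperature=0.1):
    """Blend base-shape codes by similarity of an image feature to bank keys.

    Hypothetical helper: keys has shape (slots, feat_dim),
    values has shape (slots, shape_dim).
    """
    q = image_feature / np.linalg.norm(image_feature)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    weights = softmax(k @ q / temperature)  # one weight per bank slot
    return weights @ values, weights

# Toy bank: 4 slots, 8-D semantic features, 3-D base-shape codes (assumed sizes).
num_slots, feat_dim, shape_dim = 4, 8, 3
keys = np.eye(num_slots, feat_dim)
values = np.arange(num_slots * shape_dim, dtype=float).reshape(num_slots, shape_dim)

# A feature close to slot 2 should retrieve mostly slot 2's shape code.
feature = keys[2] + 0.05
shape_code, weights = query_bank(feature, keys, values)
```

Because the query is a soft blend rather than a hard lookup, an animal that resembles several base shapes would receive a mixture of their codes, which is one way a small bank could cover many species.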
Stats
"The Fauna Dataset contains a total of 78,168 images spanning 128 quadruped species."
Quotes
"Learning 3D models of all animals in nature requires massively scaling up existing solutions."
"We collected a large-scale animal dataset of over 100 quadruped species, dubbed the Fauna Dataset, as part of the contribution."