The paper proposes a novel framework called Latent Embedding Clustering for Head Pose Estimation (LEC-HPE) that addresses the challenge of occlusions in head pose estimation. The key aspects are:
Unsupervised latent embedding clustering: The model optimizes latent feature representations for occluded and non-occluded images through a clustering term, without requiring a labeled embedding for each training image. This makes training more efficient than in prior work.
Fine-grained Euler angle estimation: The model incorporates a multi-loss scheme with classification and regression components for each Euler angle (yaw, pitch, roll) to ensure accurate fine-grained pose predictions.
Two-stage training: The first stage initializes the model parameters and feature space, while the second stage performs clustering and latent space fine-tuning.
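As an illustration of the multi-loss idea, the following minimal sketch combines a classification term and a regression term for a single Euler angle. It is not the paper's implementation: the function name `angle_multi_loss`, the binning (`bin_width`, `min_angle`), the weight `alpha`, and the use of cross-entropy plus squared error on the expected angle are all assumptions chosen for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def angle_multi_loss(logits, true_angle, bin_width=3.0, min_angle=-99.0, alpha=1.0):
    """Hypothetical combined loss for one Euler angle, in degrees.

    logits     : one score per angle bin (assumed output of the network head)
    true_angle : ground-truth angle in degrees
    bin_width, min_angle, alpha : illustrative values, not taken from the paper
    Returns (loss, predicted_angle).
    """
    probs = softmax(logits)
    num_bins = len(logits)

    # Classification term: cross-entropy against the bin holding the true angle.
    true_bin = int((true_angle - min_angle) // bin_width)
    true_bin = min(max(true_bin, 0), num_bins - 1)
    cross_entropy = -math.log(probs[true_bin])

    # Regression term: squared error of the expected angle over bin centers.
    centers = [min_angle + (i + 0.5) * bin_width for i in range(num_bins)]
    predicted = sum(p * c for p, c in zip(probs, centers))
    squared_error = (predicted - true_angle) ** 2

    return cross_entropy + alpha * squared_error, predicted


if __name__ == "__main__":
    # Logits sharply peaked on one bin; yaw, pitch, and roll would each
    # get their own head and their own call under this sketch.
    logits = [0.0] * 66
    logits[40] = 50.0
    loss, angle = angle_multi_loss(logits, true_angle=22.5)
    print(loss, angle)
```

Under these assumptions, the classification term pushes the probability mass toward the correct coarse bin, while the regression term refines the continuous estimate within and across bins. The total loss for a pose would be the sum of this quantity over yaw, pitch, and roll.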
Experiments on the BIWI, AFLW2000, and Pandora benchmark datasets show that LEC-HPE performs competitively with state-of-the-art methods while requiring substantially less ground-truth data, since no labeled embeddings are needed. The ablation study confirms that the clustering term improves robustness to occlusion.