The paper introduces a novel one-stage end-to-end multi-person 2D pose estimation algorithm called Joint Coordinate Regression and Association (JCRA). The key highlights are:
JCRA directly predicts full-body pose coordinates without requiring any post-processing steps like keypoint grouping, non-maximum suppression, or heatmap refinement. This simplifies the pipeline and improves efficiency.
JCRA employs a symmetric network architecture with an equal number of encoder and decoder layers. The summary credits this symmetry for accurate keypoint localization: the decoder maps the encoder's abstract features back into concrete keypoint coordinates.
Extensive experiments on the MS COCO and CrowdPose benchmarks demonstrate that JCRA outperforms state-of-the-art approaches in both accuracy and efficiency. JCRA achieves 69.2 mAP on COCO, surpassing previous one-stage end-to-end methods, and is 78% faster at inference than previous state-of-the-art bottom-up algorithms.
JCRA is robust and can handle a wide range of poses, including viewpoint changes, occlusions, and crowded settings, making it suitable for real-world applications.
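The first highlight, direct coordinate prediction with no grouping or non-maximum suppression, can be illustrated with a minimal sketch. This is not the authors' code: the `PoseQuery` container, the `decode_poses` function, the normalized `[0, 1]` coordinate convention, and the score threshold are all assumptions made for illustration. The point is only that, when a model emits one candidate pose per output slot, decoding reduces to a confidence filter and a rescale.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PoseQuery:
    """Hypothetical output slot: a person confidence plus normalized keypoints."""
    score: float
    keypoints: List[Tuple[float, float]]  # (x, y), each assumed to lie in [0, 1]


def decode_poses(
    queries: List[PoseQuery],
    image_width: int,
    image_height: int,
    score_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[List[Tuple[float, float]]]:
    """Turn per-slot predictions into pixel-space poses.

    No keypoint grouping, NMS, or heatmap refinement is involved: each slot
    already holds a full-body pose, so we only drop low-confidence slots and
    rescale coordinates to the image size.
    """
    poses = []
    for query in queries:
        if query.score < score_threshold:
            continue  # slot is treated as "no person"
        poses.append(
            [(x * image_width, y * image_height) for x, y in query.keypoints]
        )
    return poses


if __name__ == "__main__":
    outputs = [
        PoseQuery(score=0.9, keypoints=[(0.5, 0.25), (0.75, 0.5)]),
        PoseQuery(score=0.1, keypoints=[(0.1, 0.1), (0.2, 0.2)]),
    ]
    print(decode_poses(outputs, image_width=640, image_height=480))
```

By contrast, a bottom-up pipeline would need an extra stage after the network to assign detected keypoints to individual people, which is the step the summary says JCRA avoids.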
Key Insights Distilled From
by Dongyang Yu,... at arxiv.org 04-22-2024
https://arxiv.org/pdf/2307.01004.pdf