Core Concepts
LPSNet is the first end-to-end framework that can directly recover 3D human poses and shapes from lensless imaging measurements, without the need for intermediate image reconstruction.
Abstract
The key highlights and insights are:
Lensless imaging systems offer several advantages over traditional cameras, such as privacy protection, smaller size, simpler structure, and lower cost. However, directly estimating human pose and shape from lensless measurements is challenging because of the inherent ambiguity of the captured data: the imaging mask multiplexes the scene across the sensor, so the raw measurement is not a visually recognizable image.
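For intuition, a common simplified forward model in lensless imaging treats the measurement as the scene convolved with the mask's point spread function (PSF) plus sensor noise; the authors' actual optics may differ. The sketch below simulates a measurement under that assumption, with the PSF, noise level, and toy scene as illustrative placeholders.

```python
# Minimal sketch of a convolutional lensless forward model (an assumption,
# not the paper's exact optics): measurement = scene (*) PSF + noise.
import numpy as np
from scipy.signal import fftconvolve

def simulate_lensless_measurement(scene, psf, noise_std=0.01, rng=None):
    """Simulate a lensless measurement from a grayscale scene.

    scene: (H, W) array, the idealized focused image of the scene.
    psf:   (h, w) array, the mask's point spread function (hypothetical here).
    """
    rng = np.random.default_rng() if rng is None else rng
    measurement = fftconvolve(scene, psf, mode="same")            # global multiplexing
    measurement += rng.normal(0.0, noise_std, measurement.shape)  # additive sensor noise
    return measurement

# Toy example: a random PSF spreads every scene point across the whole
# sensor, which is why the raw measurement is not human-readable.
scene = np.zeros((128, 128))
scene[40:90, 50:80] = 1.0                                  # a crude foreground blob
psf = np.abs(np.random.default_rng(0).normal(size=(31, 31)))
psf /= psf.sum()
meas = simulate_lensless_measurement(scene, psf)
print(meas.shape, float(meas.min()), float(meas.max()))
```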
The authors propose LPSNet, the first end-to-end framework for human pose and shape estimation from lensless measurements. LPSNet consists of three main components (a rough code sketch of how they fit together follows the list):
A Multi-Scale Lensless Feature Decoder (MSFDecoder) that decodes the information optically encoded by the lensless imaging system into multi-scale features.
A human parametric model regressor that takes the multi-scale features produced by MSFDecoder and predicts the SMPL parameters.
A Double-Head Auxiliary Supervision (DHAS) mechanism that improves the estimation accuracy of human limbs.
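Only the component names and data flow above come from the summary; the layer choices, feature dimensions, iterative details, and the specific auxiliary targets (here, joint heatmaps and a body-part map) in the following PyTorch sketch are assumptions added for illustration, not the authors' architecture.

```python
# Minimal PyTorch sketch of the three components described above.
# All widths, heads, and class names (LPSNetSketch, DoubleHeadAux, ...) are
# hypothetical and chosen only to show how the pieces could connect.
import torch
import torch.nn as nn

class MSFDecoder(nn.Module):
    """Decodes a raw lensless measurement into multi-scale feature maps."""
    def __init__(self, in_ch=3, widths=(64, 128, 256)):
        super().__init__()
        self.stages = nn.ModuleList()
        prev = in_ch
        for w in widths:
            self.stages.append(nn.Sequential(
                nn.Conv2d(prev, w, 3, stride=2, padding=1),
                nn.BatchNorm2d(w), nn.ReLU(inplace=True)))
            prev = w

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)          # one feature map per scale
        return feats

class SMPLRegressor(nn.Module):
    """Predicts SMPL pose/shape/camera parameters from the coarsest features."""
    def __init__(self, in_ch=256, n_pose=72, n_shape=10, n_cam=3):
        super().__init__()
        self.n_pose, self.n_shape = n_pose, n_shape
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(
            nn.Linear(in_ch, 512), nn.ReLU(inplace=True),
            nn.Linear(512, n_pose + n_shape + n_cam))

    def forward(self, feat):
        z = self.pool(feat).flatten(1)
        out = self.head(z)
        p, s = self.n_pose, self.n_shape
        return out[:, :p], out[:, p:p + s], out[:, p + s:]   # pose, shape, cam

class DoubleHeadAux(nn.Module):
    """Two auxiliary heads providing extra supervision during training."""
    def __init__(self, in_ch=128, n_joints=24, n_parts=15):
        super().__init__()
        self.heatmap_head = nn.Conv2d(in_ch, n_joints, 1)   # 2D joint heatmaps
        self.part_head = nn.Conv2d(in_ch, n_parts, 1)       # body-part map

    def forward(self, feat):
        return self.heatmap_head(feat), self.part_head(feat)

class LPSNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.decoder = MSFDecoder()
        self.regressor = SMPLRegressor()
        self.aux = DoubleHeadAux()

    def forward(self, measurement):
        feats = self.decoder(measurement)            # multi-scale features
        pose, shape, cam = self.regressor(feats[-1])
        heatmaps, parts = self.aux(feats[-2])        # auxiliary predictions
        return pose, shape, cam, heatmaps, parts

model = LPSNetSketch()
pose, shape, cam, hm, parts = model(torch.randn(2, 3, 256, 256))
print(pose.shape, shape.shape, cam.shape, hm.shape, parts.shape)
```

In this sketch the auxiliary heads would only contribute extra loss terms on limb-related targets during training; at inference the SMPL parameters alone determine the recovered pose and shape.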
The authors build a lensless imaging system and collect datasets of both real and simulated lensless measurements to evaluate their method. Experimental results show that LPSNet outperforms a baseline approach that first reconstructs images from the lensless measurements and then estimates pose and shape from them.
The authors discuss the limitations of their approach, such as difficulties in handling complex human poses and occlusions, and suggest future work to address these challenges.