This paper introduces HTCL, a novel hierarchical temporal context learning method for camera-based 3D semantic scene completion that surpasses previous methods by effectively leveraging temporal information across frames to predict detailed 3D layouts from limited 2D image observations.
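A minimal PyTorch sketch of the underlying idea, aggregating past-frame voxel features by their affinity with the current frame; the module name, the affinity scoring, and all tensor shapes are illustrative assumptions rather than HTCL's actual hierarchical design:

```python
import torch
import torch.nn as nn

class TemporalContextAggregator(nn.Module):
    """Hypothetical sketch: weight past-frame voxel features by their
    similarity to the current frame, then fuse. Past frames are assumed
    to be already warped into the current frame's coordinates."""
    def __init__(self, channels: int):
        super().__init__()
        self.affinity = nn.Conv3d(channels * 2, 1, kernel_size=1)  # per-voxel relevance score
        self.fuse = nn.Conv3d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, curr: torch.Tensor, past: list) -> torch.Tensor:
        # curr and each past frame: (B, C, D, H, W) voxel features
        context = torch.zeros_like(curr)
        for prev in past:
            score = torch.sigmoid(self.affinity(torch.cat([curr, prev], dim=1)))
            context = context + score * prev
        context = context / max(len(past), 1)
        return self.fuse(torch.cat([curr, context], dim=1))

if __name__ == "__main__":
    agg = TemporalContextAggregator(channels=32)
    curr = torch.randn(1, 32, 16, 64, 64)
    past = [torch.randn(1, 32, 16, 64, 64) for _ in range(2)]
    print(agg(curr, past).shape)  # torch.Size([1, 32, 16, 64, 64])
```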
TALoS, a novel test-time adaptation method, leverages the sequential nature of LiDAR data in autonomous driving to improve the accuracy of semantic scene completion by using past and future observations as self-supervision for adapting a pre-trained model to unseen driving environments.
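A rough sketch of one such adaptation step, assuming a hypothetical voxel layout in which voxels verified by a later LiDAR scan supply pseudo-labels; none of the names below come from TALoS itself:

```python
import torch
import torch.nn.functional as F

def talos_style_update(model, optimizer, curr_scan, future_occupancy, future_labels):
    """Hypothetical single test-time adaptation step: voxels that a later
    LiDAR scan actually observes act as self-supervision for the current
    prediction. Tensor layouts are assumptions, not TALoS's API."""
    model.train()
    logits = model(curr_scan)                 # (B, num_classes, D, H, W) semantic logits
    mask = future_occupancy.bool()            # (B, D, H, W): voxels verified by a future scan
    loss = F.cross_entropy(
        logits.permute(0, 2, 3, 4, 1)[mask],  # logits at verified voxels -> (N, num_classes)
        future_labels[mask],                  # long-tensor pseudo-labels from the future scan
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```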
This paper introduces CGFormer, a novel neural network architecture for semantic scene completion that leverages context-aware query generation and 3D deformable cross-attention within a voxel transformer framework to improve the accuracy of 3D scene reconstruction from 2D images.
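A simplified, single-head sketch of what 3D deformable cross-attention can look like, with learned sampling offsets gathered by trilinear interpolation; the dimensions and design choices are assumptions for exposition, not CGFormer's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Deformable3DCrossAttention(nn.Module):
    """Sketch: each voxel query predicts a few 3D sampling offsets, gathers
    features there via trilinear interpolation, and mixes them with learned
    per-point weights."""
    def __init__(self, dim: int, num_points: int = 4):
        super().__init__()
        self.num_points = num_points
        self.offsets = nn.Linear(dim, num_points * 3)  # per-query (dx, dy, dz) offsets
        self.weights = nn.Linear(dim, num_points)      # per-point mixing weights
        self.proj = nn.Linear(dim, dim)

    def forward(self, queries, ref_points, value):
        # queries: (B, N, C); ref_points: (B, N, 3) in [-1, 1]; value: (B, C, D, H, W)
        B, N, C = queries.shape
        offsets = self.offsets(queries).view(B, N, self.num_points, 3)
        locs = ref_points.unsqueeze(2) + 0.1 * torch.tanh(offsets)   # bounded offsets
        grid = locs.view(B, N * self.num_points, 1, 1, 3)            # flat sampling grid
        sampled = F.grid_sample(value, grid, align_corners=False)    # (B, C, N*P, 1, 1)
        sampled = sampled.view(B, C, N, self.num_points).permute(0, 2, 3, 1)  # (B, N, P, C)
        w = torch.softmax(self.weights(queries), dim=-1).unsqueeze(-1)        # (B, N, P, 1)
        return self.proj((w * sampled).sum(dim=2))                            # (B, N, C)
```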
This paper presents a one-stage camera-based semantic scene completion framework that propagates semantics from semantic-aware seed voxels to the whole scene, guided by spatial geometry cues.
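An illustrative sketch of seed-to-scene propagation, using repeated local averaging as a stand-in for the paper's geometry-guided propagation; the function name and smoothing scheme are hypothetical:

```python
import torch
import torch.nn.functional as F

def propagate_from_seeds(seed_logits, seed_mask, num_iters: int = 3):
    """Diffuse confident "seed" voxel predictions to unlabeled neighbors
    with repeated average pooling, keeping the seeds themselves fixed.
    seed_logits: (B, C, D, H, W); seed_mask: (B, 1, D, H, W) in {0, 1}."""
    probs = torch.softmax(seed_logits, dim=1) * seed_mask
    for _ in range(num_iters):
        spread = F.avg_pool3d(probs, kernel_size=3, stride=1, padding=1)
        # fill only non-seed voxels; original seeds stay untouched
        probs = torch.where(seed_mask.bool(), probs, spread)
    return probs.argmax(dim=1)  # (B, D, H, W) propagated semantic labels
```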
This paper proposes a novel approach that jointly estimates missing geometry and semantics from sparse LiDAR point clouds using denoising diffusion probabilistic models.
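For context, one reverse (denoising) step of a standard DDPM conditioned on a sparse LiDAR volume might look like the sketch below; the conditioning interface and tensor shapes are assumptions, and only the posterior-mean formula is the standard DDPM one:

```python
import torch

def ddpm_reverse_step(model, x_t, t, alphas_cumprod, lidar_cond):
    """One standard DDPM denoising step on a noisy voxel volume x_t,
    conditioned on a sparse LiDAR observation. alphas_cumprod: (T,) schedule
    of cumulative alpha-bar values; t: Python int timestep."""
    a_t = alphas_cumprod[t]
    a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
    beta_t = 1.0 - a_t / a_prev
    eps = model(x_t, t, lidar_cond)  # predicted noise, same shape as x_t
    # posterior mean: (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
    mean = (x_t - beta_t / torch.sqrt(1.0 - a_t) * eps) / torch.sqrt(1.0 - beta_t)
    if t == 0:
        return mean
    return mean + torch.sqrt(beta_t) * torch.randn_like(x_t)
```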
The proposed BRGScene framework bridges stereo geometry and the bird's-eye-view (BEV) representation to achieve reliable and accurate semantic scene completion solely from stereo RGB images.
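A minimal sketch of bridging stereo geometry with a BEV map, collapsing a stereo-derived 3D feature grid along height and fusing it with image BEV features; the layer choices are illustrative assumptions, not the BRGScene architecture:

```python
import torch
import torch.nn as nn

class StereoBEVBridge(nn.Module):
    """Flatten a stereo cost-volume-derived 3D feature grid along its height
    axis into a BEV map, then fuse it with image-derived BEV features."""
    def __init__(self, c3d: int, depth: int, c_bev: int):
        super().__init__()
        self.to_bev = nn.Conv2d(c3d * depth, c_bev, kernel_size=1)  # collapse height axis
        self.fuse = nn.Sequential(
            nn.Conv2d(c_bev * 2, c_bev, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, stereo_vol, img_bev):
        # stereo_vol: (B, C, Z, X, Y) geometry features; img_bev: (B, c_bev, X, Y)
        B, C, Z, X, Y = stereo_vol.shape
        geo_bev = self.to_bev(stereo_vol.reshape(B, C * Z, X, Y))
        return self.fuse(torch.cat([geo_bev, img_bev], dim=1))
```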
This paper proposes a hardness-aware semantic scene completion (HASSC) approach that improves the accuracy of existing models in challenging regions without incurring extra inference cost. Its key innovations are a hard voxel mining (HVM) head that leverages both global and local hardness to focus training on hard voxels, and a self-distillation strategy that enhances the stability and consistency of the model.
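A hardness-weighted objective in this spirit might look like the following sketch, where low prediction confidence serves as a generic hardness proxy and an EMA teacher supplies the distillation target; neither is claimed to match the exact HVM head:

```python
import torch
import torch.nn.functional as F

def hardness_weighted_loss(logits, labels, ema_logits, alpha: float = 0.5):
    """Up-weight per-voxel cross-entropy where the model is least confident
    (a proxy for "hard" voxels), plus self-distillation against an EMA
    teacher. logits, ema_logits: (B, C, D, H, W); labels: (B, D, H, W)."""
    ce = F.cross_entropy(logits, labels, reduction="none")  # (B, D, H, W)
    probs = torch.softmax(logits, dim=1)
    hardness = 1.0 - probs.max(dim=1).values                # low confidence => hard
    mined = ((1.0 + hardness).detach() * ce).mean()
    distill = F.kl_div(
        torch.log_softmax(logits, dim=1),
        torch.softmax(ema_logits.detach(), dim=1),
        reduction="batchmean",
    )
    return mined + alpha * distill
```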
SLCF-Net introduces a novel approach for Semantic Scene Completion that fuses LiDAR and camera data to estimate missing geometry and semantics in urban driving scenarios, achieving superior performance on SSC metrics.
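A minimal sketch of gated LiDAR-camera voxel fusion, assuming camera features have already been projected into the shared voxel grid; the gating scheme is a generic assumption, not SLCF-Net's actual fusion module:

```python
import torch
import torch.nn as nn

class LidarCameraFusion(nn.Module):
    """Gate camera voxel features against LiDAR voxel features, then fuse.
    Both inputs: (B, C, D, H, W) in a shared coordinate frame."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv3d(channels * 2, channels, kernel_size=1)
        self.fuse = nn.Conv3d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, lidar_feat, cam_feat):
        g = torch.sigmoid(self.gate(torch.cat([lidar_feat, cam_feat], dim=1)))
        cam_feat = g * cam_feat  # suppress camera features where the gate is low
        return self.fuse(torch.cat([lidar_feat, cam_feat], dim=1))
```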
The authors propose the Adversarial Modality Modulation Network (AMMNet) to address ineffective feature learning and overfitting in semantic scene completion, achieving significant performance improvements.
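The adversarial component can be sketched as a standard GAN-style alternating update, with a discriminator scoring completed scenes against ground truth; the losses below are generic assumptions, not AMMNet's exact modulation scheme:

```python
import torch
import torch.nn as nn

def adversarial_step(generator, discriminator, g_opt, d_opt, inputs, gt_scene):
    """One alternating GAN update: the discriminator learns to separate
    completed scenes from ground truth, and the completion network is
    then pushed to fool it."""
    bce = nn.BCEWithLogitsLoss()
    fake = generator(inputs)  # completed semantic volume
    # --- discriminator update ---
    d_real = discriminator(gt_scene)
    d_fake = discriminator(fake.detach())
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()
    # --- generator update: fool the discriminator ---
    g_adv = discriminator(fake)
    g_loss = bce(g_adv, torch.ones_like(g_adv))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```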