Vision transformers with shifted window attention improve voxel 3D reconstruction accuracy.


r3d-swin-voxel-3d-reconstruction-with-shifted-window-attention


R3D-SWIN: Voxel 3D Reconstruction with Shifted Window Attention



Gamba efficiently reconstructs 3D assets from single images, combining Gaussian splatting and Mamba for speed and quality.


efficient-3d-reconstruction-with-gamba-gaussian-splatting-and-mamba


Efficient 3D Reconstruction with Gamba: Gaussian Splatting and Mamba



VistaDream reconstructs high-quality, consistent 3D scenes from single-view images using a two-stage pipeline that combines diffusion models and vision-language models. It outperforms existing methods without requiring fine-tuning.


vistadream-a-two-stage-framework-for-single-view-3d-scene-reconstruction-using-diffusion-models-and-vision-language-models


VistaDream: A Two-Stage Framework for Single-View 3D Scene Reconstruction Using Diffusion Models and Vision-Language Models