MVLight, a novel light-conditioned multi-view diffusion model, enhances text-to-3D generation by integrating lighting conditions directly into the generation process, resulting in higher-quality 3D models with improved geometric precision and superior relighting capabilities compared to existing methods.
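For intuition on what "integrating lighting conditions directly into the generation process" can look like, the toy module below injects a lighting embedding as one extra conditioning token alongside the text tokens of a denoiser. It is a minimal sketch under assumed layer sizes and a stand-in attention backbone, not MVLight's actual architecture.

```python
import torch
import torch.nn as nn

class LightConditionedDenoiser(nn.Module):
    """Toy denoiser conditioned on both text and lighting.

    Purely illustrative: the layer sizes, the lighting parameterization,
    and the TransformerDecoderLayer stand-in for a UNet/DiT backbone are
    all assumptions, not MVLight's design.
    """
    def __init__(self, cond_dim=768, light_dim=16):
        super().__init__()
        self.light_proj = nn.Linear(light_dim, cond_dim)   # lift lighting params to token width
        self.backbone = nn.TransformerDecoderLayer(
            d_model=cond_dim, nhead=8, batch_first=True)   # stand-in denoising backbone

    def forward(self, x_tokens, text_tokens, light_params):
        # x_tokens:     (B, N, cond_dim) noisy latent tokens
        # text_tokens:  (B, T, cond_dim) text conditioning tokens
        # light_params: (B, light_dim)   e.g. low-order lighting coefficients
        light_token = self.light_proj(light_params).unsqueeze(1)   # (B, 1, cond_dim)
        cond = torch.cat([text_tokens, light_token], dim=1)        # joint text + light conditioning
        return self.backbone(x_tokens, cond)

# Toy usage with arbitrary shapes.
model = LightConditionedDenoiser()
out = model(torch.randn(2, 16, 768), torch.randn(2, 8, 768), torch.randn(2, 16))
```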
DreamPolish is a novel text-to-3D generation model that leverages progressive geometry generation and domain score distillation to produce 3D objects with refined geometry and high-quality, photorealistic textures, outperforming existing state-of-the-art methods.
This paper surveys the rapidly developing field of text-to-3D generation, exploring core technologies, seminal methods, enhancement directions, and applications, ultimately highlighting its potential to revolutionize 3D content creation.
Layout-Your-3D enables efficient and controllable generation of complex 3D scenes from text prompts by leveraging 2D layouts as blueprints, outperforming existing methods in speed and accuracy.
JointDreamer introduces Joint Score Distillation (JSD), a novel method that enhances 3D consistency in text-to-3D generation by modeling inter-view coherence, addressing the cross-view inconsistency that arises because traditional Score Distillation Sampling (SDS) optimizes each rendered view independently.
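As background for JSD, plain Score Distillation Sampling nudges each rendered view of a 3D representation toward a pretrained 2D diffusion prior one view at a time, which is where cross-view inconsistencies can creep in. Below is a minimal, generic SDS update under assumed `renderer` and `diffusion` interfaces; it is not JointDreamer's joint formulation, which additionally couples the denoising scores across views.

```python
import torch

def sds_step(renderer, diffusion, params, camera, prompt_emb, optimizer):
    """One generic Score Distillation Sampling (SDS) update.

    Assumed stand-in interfaces (not from any specific library):
      renderer(params, camera)             -> image, differentiable w.r.t. params
      diffusion.add_noise(image, noise, t) -> noisy image x_t
      diffusion.pred_noise(x_t, t, cond)   -> predicted noise
    """
    image = renderer(params, camera)                      # differentiable rendering

    # Sample a diffusion timestep and corrupt the rendered view.
    t = torch.rand(1, device=image.device) * 0.96 + 0.02
    noise = torch.randn_like(image)
    with torch.no_grad():
        x_t = diffusion.add_noise(image, noise, t)
        eps_pred = diffusion.pred_noise(x_t, t, prompt_emb)

    # The SDS gradient is w(t) * (eps_pred - noise); apply it through a
    # surrogate loss whose gradient w.r.t. the rendered image equals it.
    w_t = 1.0                                             # weighting schedule: assumed constant
    grad = w_t * (eps_pred - noise)
    loss = (grad.detach() * image).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.detach()
```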
Semantic Score Distillation Sampling (SemanticSDS) improves compositional text-to-3D generation by incorporating semantic embeddings and region-specific denoising, enabling the creation of complex scenes with multiple, detailed objects and interactions.
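One way to picture "region-specific denoising" is to blend noise predictions from per-object prompt embeddings according to soft semantic masks before distillation. The helper below sketches that generic composition, reusing the assumed `diffusion.pred_noise` interface from the SDS sketch above; it is an illustration, not SemanticSDS itself.

```python
import torch

def region_composed_noise_pred(diffusion, x_t, t, region_masks, region_prompts):
    """Blend per-region noise predictions into a single score.

    x_t:            (B, C, H, W) noisy rendering at timestep t
    region_masks:   (B, K, H, W) soft masks that sum to 1 over the K regions
    region_prompts: list of K prompt embeddings, one per object/region
    """
    eps = torch.zeros_like(x_t)
    for k, prompt_emb in enumerate(region_prompts):
        eps_k = diffusion.pred_noise(x_t, t, prompt_emb)    # score for region k's prompt
        eps = eps + region_masks[:, k:k + 1] * eps_k        # mask-weighted contribution
    return eps
```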
SeMv-3D is a novel framework that leverages triplane priors and a two-step learning process to generate semantically consistent and multi-view coherent 3D objects from text descriptions.
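For readers unfamiliar with the triplane prior mentioned here: a triplane encodes a 3D field as three axis-aligned 2D feature planes, and a point's feature is obtained by projecting it onto each plane and aggregating the sampled values. The snippet below shows only that lookup, with arbitrary resolution and channel count; it is not SeMv-3D's architecture.

```python
import torch
import torch.nn.functional as F

def triplane_lookup(planes, points):
    """Query features for 3D points from a triplane.

    planes: (3, C, H, W) feature planes for the XY, XZ, and YZ planes
    points: (N, 3) coordinates normalized to [-1, 1]
    Returns (N, C) features, aggregated here by summation.
    """
    xy, xz, yz = points[:, [0, 1]], points[:, [0, 2]], points[:, [1, 2]]

    feats = []
    for plane, coords in zip(planes, (xy, xz, yz)):
        grid = coords.view(1, -1, 1, 2)                     # sampling grid for grid_sample
        sampled = F.grid_sample(plane.unsqueeze(0), grid,
                                mode="bilinear", align_corners=True)
        feats.append(sampled.view(plane.shape[0], -1).t())  # (N, C) per plane

    return sum(feats)

# Toy usage with arbitrary sizes: 32-channel planes at 64x64 resolution.
planes = torch.randn(3, 32, 64, 64)
points = torch.rand(1000, 3) * 2 - 1
features = triplane_lookup(planes, points)                  # (1000, 32)
```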
BoostDream, a novel method that combines differentiable rendering with guidance from text-to-image diffusion models, can efficiently refine coarse 3D assets into high-quality 3D content guided by text prompts.
DreamMesh is a novel text-to-3D generation framework that leverages an explicit triangle-mesh representation to produce high-quality 3D models with clean geometry and rich texture details.
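To make the explicit triangle-mesh representation concrete, the toy module below keeps fixed triangle connectivity and optimizes per-vertex offsets on top of a template shape, the kind of parameterization an explicit-mesh pipeline can expose to gradient-based refinement. The class and the offset parameterization are illustrative assumptions, not DreamMesh's actual design.

```python
import torch
import torch.nn as nn

class DeformableMesh(nn.Module):
    """Explicit triangle mesh with learnable per-vertex offsets.

    A minimal sketch: the differentiable renderer and the texture/guidance
    losses a full pipeline would attach are deliberately omitted.
    """
    def __init__(self, vertices, faces):
        super().__init__()
        self.register_buffer("base_vertices", vertices)          # (V, 3) template geometry
        self.register_buffer("faces", faces)                     # (F, 3) triangle indices
        self.offsets = nn.Parameter(torch.zeros_like(vertices))  # learnable deformation

    def forward(self):
        # Deformed geometry = template + learned offsets; connectivity stays fixed.
        return self.base_vertices + self.offsets, self.faces

# Toy usage: a single triangle as the template.
verts = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
faces = torch.tensor([[0, 1, 2]])
deformed_verts, tri = DeformableMesh(verts, faces)()
```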
TPA3D, a GAN-based deep learning framework, can efficiently generate high-quality 3D textured meshes that closely align with detailed text descriptions, without relying on human-annotated text-3D pairs for training.