Core Concepts
Proposing DDMI, a domain-agnostic latent diffusion model for synthesizing high-quality implicit neural representations across various signal domains.
Abstract
Introducing DDMI to address limitations in existing generative models for implicit neural representations (INRs).
Generating adaptive positional embeddings rather than the weights of the INR network, which prior methods synthesize.
Demonstrating superior performance compared to existing INR generative models across four modalities.
Introduction:
INRs provide flexibility and expressivity in representing arbitrary signals.
Recent research focuses on INR generative models using Normalizing Flows, GANs, and Diffusion Models.
Existing models generate INR weights while keeping positional embeddings fixed, limiting their expressive power and output quality.
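To make the INR concept concrete, here is a minimal sketch of a coordinate-based network with a fixed sinusoidal positional embedding, the kind of embedding the paper argues limits prior generative models. The architecture, sizes, and random weights are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_features(coords, num_freqs=4):
    """Fixed sinusoidal positional embedding of 1-D coordinates in [0, 1]."""
    freqs = 2.0 ** np.arange(num_freqs)                 # (F,)
    angles = coords[:, None] * freqs[None, :] * np.pi   # (N, F)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)  # (N, 2F)

# Randomly initialized 2-layer MLP standing in for a trained INR.
W1 = rng.normal(size=(8, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)); b2 = np.zeros(1)

def inr(coords):
    """Map continuous coordinates to signal values (the essence of an INR)."""
    h = np.maximum(fourier_features(coords) @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2

# An INR can be queried at arbitrary continuous positions, not just grid points.
coords = np.linspace(0.0, 1.0, 5)
print(inr(coords).shape)  # (5, 1): one signal value per queried coordinate
```

Because the signal is a function of coordinates, resolution is decoupled from storage, which is the flexibility the notes above refer to.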
Methodology:
Presenting DDMI for synthesizing high-quality INRs with adaptive positional embeddings.
Introducing Discrete-to-continuous space Variational AutoEncoder (D2C-VAE) and Hierarchically Decomposed Basis Fields (HDBFs).
Describing the two-stage training procedure: first training the D2C-VAE, then training a diffusion model in its latent space.
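The adaptive-positional-embedding idea can be sketched as follows: a latent (in DDMI, produced by the D2C-VAE encoder and later sampled by the diffusion model) is decoded into embedding grids at several resolutions, standing in for Hierarchically Decomposed Basis Fields, and those grids are interpolated at continuous coordinates. The linear "decoders", grid resolutions, and all shapes here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, embed_dim = 16, 8
resolutions = [4, 8, 16]  # coarse-to-fine basis-field grids (illustrative)

# Hypothetical linear decoders, one per scale: latent -> embedding grid.
decoders = [rng.normal(scale=0.1, size=(latent_dim, r * embed_dim))
            for r in resolutions]

def sample_grid(grid, x):
    """Linearly interpolate an (R, D) 1-D embedding grid at coordinates x in [0, 1]."""
    pos = x * (grid.shape[0] - 1)
    lo = np.clip(np.floor(pos).astype(int), 0, grid.shape[0] - 2)
    w = (pos - lo)[:, None]
    return (1 - w) * grid[lo] + w * grid[lo + 1]

def adaptive_pe(z, x):
    """Decode z into multi-scale grids and sum their samples at coordinates x."""
    pe = np.zeros((x.shape[0], embed_dim))
    for r, W in zip(resolutions, decoders):
        grid = (z @ W).reshape(r, embed_dim)  # latent-conditioned basis field
        pe += sample_grid(grid, x)            # coarse + fine contributions
    return pe

z = rng.normal(size=latent_dim)  # in DDMI this latent comes from the diffusion model
x = np.linspace(0.0, 1.0, 6)
print(adaptive_pe(z, x).shape)   # (6, 8): one adaptive embedding per coordinate
```

The key contrast with the fixed embedding above is that the embedding now depends on the latent, so each generated signal gets its own positional basis, decomposed across scales.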
Experiments:
Evaluating DDMI across four modalities: 2D images, 3D shapes, Neural Radiance Fields, and videos.
Comparing results with domain-specific and domain-agnostic baselines.
Conducting quantitative analysis with metrics such as FID, MMD, and COV, and qualitative analysis through visualizations.
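Two of the metrics named above, MMD (minimum matching distance) and COV (coverage), can be sketched on toy 2-D point sets with Euclidean distance. Real shape evaluations typically use domain-specific distances such as Chamfer distance between point clouds; the data and sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
generated = rng.normal(size=(20, 2))  # stand-ins for generated samples
reference = rng.normal(size=(15, 2))  # stand-ins for the reference/test set

# Pairwise distance matrix: dist[i, j] = ||generated[i] - reference[j]||.
dist = np.linalg.norm(generated[:, None, :] - reference[None, :, :], axis=-1)

# MMD: for each reference sample, the distance to its closest generated
# sample, averaged; lower means generated samples match the reference better.
mmd = dist.min(axis=0).mean()

# COV: fraction of reference samples that are the nearest neighbor of at
# least one generated sample; higher means better mode coverage.
cov = np.unique(dist.argmin(axis=1)).size / reference.shape[0]

print(round(mmd, 3), round(cov, 3))
```

FID is computed differently (from Inception-feature statistics of image sets) and is omitted here since it requires a pretrained network.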
Analysis:
Analyzing the decomposition of HDBFs to capture signals of different scales effectively.
Conducting an ablation study to evaluate the impact of each component in DDMI.
Conclusion:
Summarizing the effectiveness of DDMI in synthesizing high-quality INRs across various signal domains.
Stats
"Extensive experiments across four modalities, e.g., 2D images, 3D shapes, Neural Radiance Fields, and videos"
"Code is available at https://github.com/mlvlab/DDMI."