BlockFusion is a diffusion-based model that generates 3D scenes as unit blocks and seamlessly incorporates new blocks to extend the scene. It combines a latent tri-plane representation with a denoising diffusion process to produce diverse, geometrically consistent, large unbounded 3D scenes with high-quality shapes.
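To make the tri-plane idea concrete, here is a minimal sketch of how a feature is read from three axis-aligned planes for a single 3D point. It illustrates the general tri-plane technique only. The function names, the plane layout `(C, H, W)`, the `[0, 1]` coordinate range, and the choice to sum the three sampled features are all assumptions for illustration and are not taken from BlockFusion.

```python
import numpy as np

def sample_plane(plane, u, v):
    """Bilinearly sample a (C, H, W) feature plane at coordinates u, v in [0, 1]."""
    _, h, w = plane.shape
    # Map normalized coordinates to continuous pixel positions.
    x = float(np.clip(u, 0.0, 1.0)) * (w - 1)
    y = float(np.clip(v, 0.0, 1.0)) * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - fx) * plane[:, y0, x0] + fx * plane[:, y0, x1]
    bottom = (1 - fx) * plane[:, y1, x0] + fx * plane[:, y1, x1]
    return (1 - fy) * top + fy * bottom

def query_triplane(planes, point):
    """Project a point onto the xy, xz and yz planes and sum the sampled features.

    Summation is one common aggregation choice (assumed here); concatenation
    is another.
    """
    x, y, z = point
    return (sample_plane(planes["xy"], x, y)
            + sample_plane(planes["xz"], x, z)
            + sample_plane(planes["yz"], y, z))

# Hypothetical usage: three random 8-channel planes at 16x16 resolution.
rng = np.random.default_rng(0)
planes = {name: rng.standard_normal((8, 16, 16)) for name in ("xy", "xz", "yz")}
feature = query_triplane(planes, (0.25, 0.5, 0.75))  # shape (8,)
```

In a full pipeline, a small decoder network would typically map such a feature to a geometric quantity, and the diffusion process would operate on the planes themselves; both steps are outside this sketch.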
This work proposes an attention-based conditional variational autoencoder (cVAE) that generates diverse and plausible 3D scene layouts from input scene graphs.
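The diversity in a cVAE comes from sampling a latent code and decoding it together with the condition. The sketch below shows only that generic mechanism: the reparameterized sample, the KL term against a standard normal prior, and concatenation of latent and condition. All names are hypothetical, the condition vector merely stands in for an encoded scene graph, and the attention-based encoder and decoder of the summarized model are not reproduced.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def build_decoder_input(z, condition):
    """Concatenate the latent code with the condition vector (assumed scheme)."""
    return np.concatenate([z, condition])

# Hypothetical usage: the same condition yields different decoder inputs
# for different latent samples, which is the source of layout diversity.
rng = np.random.default_rng(0)
condition = np.array([1.0, 0.0, 0.0, 1.0])  # stand-in for a scene-graph embedding
mu, log_var = np.zeros(3), np.zeros(3)      # prior parameters used at generation time
sample_a = build_decoder_input(sample_latent(mu, log_var, rng), condition)
sample_b = build_decoder_input(sample_latent(mu, log_var, rng), condition)
```

At generation time the latent is drawn from the prior, so repeated calls with one scene graph produce different candidate layouts.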
The proposed method generates 3D scenes from three complementary input conditions: partial images, top-view layout information, and text prompts. This addresses a limitation of existing methods, which rely on a single condition.
AnyHome is a new framework that generates three-dimensional, structured indoor scenes from free-form text input.