This work introduces ChatSim, a system that lets users edit 3D driving scenes through natural-language commands. It proposes two rendering components, McNeRF and McLight, to improve photo-realism and flexibility in autonomous-driving simulation. Experiments on the Waymo Open Dataset demonstrate the system's ability to handle complex commands and generate realistic scene videos.
Traditional graphics engines like CARLA and UE offer editable virtual environments but lack data realism due to limitations in asset modeling. Image-generation methods like BEVControl produce realistic scene images but struggle with cross-view consistency. Rendering-based methods like UniSim and MARS provide scene-editing tools but require extensive user involvement.
ChatSim addresses these limitations by employing collaborative LLM agents for efficient editing of complex driving scenes through natural language commands. McNeRF is introduced for background rendering, while McLight enhances foreground rendering with accurate lighting estimation.
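The collaborative-agent idea can be illustrated with a minimal sketch: a coordinator decomposes a natural-language edit command into clauses and routes each to a specialist agent. All names here (`VehicleAgent`, `dispatch`, the keyword routing) are hypothetical illustrations, not ChatSim's actual API; the real system uses LLMs for both decomposition and routing.

```python
# Hypothetical sketch of collaborative agent dispatch; names and the
# keyword-based routing are illustrative, not ChatSim's actual design.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    keywords: tuple                      # trigger words this agent handles
    log: list = field(default_factory=list)

    def handle(self, task: str) -> str:
        # A real agent would call a rendering or editing backend here.
        self.log.append(task)
        return f"{self.name}: done '{task}'"

def dispatch(command: str, agents: list) -> list:
    """Split a command into clauses and route each to the first matching agent.
    The real system would use an LLM to decompose and assign sub-tasks."""
    results = []
    for clause in command.lower().split(" and "):
        words = clause.split()
        for agent in agents:
            if any(k in words for k in agent.keywords):
                results.append(agent.handle(clause.strip()))
                break
    return results

agents = [
    Agent("VehicleAgent", ("add", "remove", "car")),
    Agent("MotionAgent", ("drive", "turn", "speed")),
    Agent("ViewAgent", ("camera", "view", "rotate")),
]

out = dispatch("Add a red car and rotate the camera left", agents)
```

Routing on whole words rather than substrings avoids spurious matches (e.g. "car" inside "camera"); an LLM-based coordinator makes such heuristics unnecessary.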
Extensive experiments show that ChatSim outperforms existing methods in generating photo-realistic customized perception data from human language commands. The system achieves state-of-the-art performance in realism and lighting estimation, showcasing its potential for enhancing autonomous driving simulations.
Source: arxiv.org