The paper presents a novel approach for high-fidelity and transferable photorealistic editing of 3D scenes represented by neural radiance fields (NeRFs). The key insight is that the low-frequency components of images, which predominantly define the appearance style, exhibit enhanced multi-view consistency after editing compared to their high-frequency counterparts.
The proposed framework comprises two branches: a high-frequency branch that preserves content details, and a low-frequency branch that performs the style editing in feature space. The low-frequency branch first extracts the low-frequency component of the full scene feature map with a low-pass filter; a stylization network then edits this low-frequency feature according to the desired style. Finally, the edited low-frequency component is blended with the high-frequency details of the original scene to produce the high-fidelity edited image.
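To make the decompose-edit-blend pipeline concrete, here is a minimal PyTorch sketch. It illustrates the idea rather than the paper's implementation: the Gaussian kernel used as the low-pass filter, the placeholder `LowFrequencyStylizer` network, the 256-channel feature map, and all function names are assumptions for illustration, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def gaussian_kernel(size: int = 9, sigma: float = 2.0) -> torch.Tensor:
    """Build a normalized 2D Gaussian kernel used here as the low-pass filter."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g1d = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g1d = g1d / g1d.sum()
    return torch.outer(g1d, g1d)  # shape: (size, size)


def low_pass(feat: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Apply the low-pass filter channel-wise to a feature map of shape (B, C, H, W)."""
    c = feat.shape[1]
    weight = kernel.to(feat).repeat(c, 1, 1, 1)  # depthwise kernel: (C, 1, k, k)
    pad = kernel.shape[-1] // 2
    return F.conv2d(feat, weight, padding=pad, groups=c)


class LowFrequencyStylizer(nn.Module):
    """Placeholder stylization network; it only ever sees the low-frequency branch."""

    def __init__(self, channels: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, low_feat: torch.Tensor) -> torch.Tensor:
        return self.net(low_feat)


def edit_scene_feature(feat: torch.Tensor, stylizer: nn.Module,
                       kernel: torch.Tensor) -> torch.Tensor:
    """Decompose, stylize the low-frequency part, and re-blend the high-frequency detail."""
    low = low_pass(feat, kernel)   # smooth, low-frequency component
    high = feat - low              # residual high-frequency detail
    low_edited = stylizer(low)     # style editing on the smooth component only
    return low_edited + high       # blend edited style with the original detail


# Usage on a dummy rendered feature map
feat = torch.randn(1, 256, 64, 64)
out = edit_scene_feature(feat, LowFrequencyStylizer(256), gaussian_kernel())
print(out.shape)  # torch.Size([1, 256, 64, 64])
```

The design choice the sketch mirrors is that the stylization network never touches the high-frequency residual; those details are added back unchanged, which is what keeps the edited renderings sharp and consistent across views.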
This frequency-decomposed approach offers several advantages: editing only the smooth low-frequency component keeps the result consistent across viewpoints, re-injecting the untouched high-frequency details keeps the edited renderings sharp and faithful to the original content, and performing the edit in the low-frequency feature space is what makes it transferable.
The experiments demonstrate the superior performance of the proposed method in terms of multi-view consistency, image quality, and sharpness compared to previous NeRF editing approaches.