The authors introduce WinSyn, a unique dataset and testbed for creating high-quality synthetic data with procedural modeling techniques. The dataset contains 75,739 high-resolution photographs of windows from around the world, with 89,318 individual window crops showcasing diverse geometric and material characteristics.
To evaluate the procedural model, the authors train semantic segmentation networks on both synthetic and real images and compare their performance on a shared test set of real images. They measure the difference in mean Intersection over Union (mIoU) and estimate how many real images are needed to match the performance obtained by training on the synthetic data.
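The evaluation described here can be illustrated with a minimal sketch. It computes mIoU from a confusion matrix and then estimates an "effective number of real images" by interpolation. The function names, the NumPy implementation, and the log-scale interpolation are all assumptions made for illustration. The summary does not say how the authors compute the effective number, so this should not be read as their method.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count label pairs: rows are true classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    flat_index = y_true * num_classes + y_pred
    counts = np.bincount(flat_index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(cm):
    """Average per-class IoU, skipping classes absent from both labels and predictions."""
    true_positive = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - true_positive
    present = union > 0
    return float((true_positive[present] / union[present]).mean())


def effective_real_images(real_counts, real_mious, synthetic_miou):
    """Estimate how many real images reach the synthetic model's mIoU.

    Assumption (not from the source): mIoU grows monotonically with the
    number of real training images, and we interpolate on a log scale.
    real_mious must be sorted in increasing order. Values outside the
    measured range are clamped to the nearest measured count.
    """
    log_counts = np.log(np.asarray(real_counts, dtype=float))
    return float(np.exp(np.interp(synthetic_miou, real_mious, log_counts)))


# Hypothetical toy labels for a two-class problem.
cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
miou = mean_iou(cm)

# Hypothetical mIoU values for models trained on 100 and 1000 real images.
n_eff = effective_real_images([100, 1000], [0.3, 0.5], synthetic_miou=0.4)
```

In practice the confusion matrix would be accumulated over every image in the shared real test set before calling `mean_iou`, so that both the synthetic-trained and real-trained networks are scored on identical pixels.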
The authors design a baseline procedural model as a benchmark and provide 21,290 synthetically generated images. They conduct extensive experiments and ablations to understand the impact of various features in the synthetic dataset on segmentation performance. Key factors such as rendering samples, materials, lighting, camera positions, and window geometry are analyzed.
The authors find that while the procedural model can generate diverse and visually realistic window images, networks trained on its output often fall short of those trained on real-world imagery. They highlight the limits of current procedural modeling techniques, especially in replicating the spatial semantics of real-world scenes. This insight matters because procedural models have the potential to give access to hidden scene aspects such as depth, reflectivity, material properties, and lighting conditions.
Source: arxiv.org