Key Concepts
This paper provides a comprehensive survey of deep learning techniques for event-based vision, covering event representations, quality enhancement, image/video reconstruction and restoration, and scene understanding and 3D vision. It also conducts benchmark experiments and discusses challenges and future research directions in this field.
Summary
The paper first examines the typical event representations and quality enhancement methods, as they play a crucial role as inputs to deep learning models. It then provides a comprehensive survey of existing deep learning-based methods, structurally grouping them into two major categories: 1) image/video reconstruction and restoration, and 2) event-based scene understanding and 3D vision.
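To make the idea of an event representation concrete, here is a minimal sketch of one simple option: accumulating signed polarities per pixel into a frame-like grid. It is an illustration only, not the paper's method. The survey covers several representations, and the names `Event` and `events_to_frame` are hypothetical. It assumes each event carries a timestamp, pixel position, and a polarity of +1 or -1.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One event; field names are illustrative assumptions."""
    t: float  # timestamp in seconds
    x: int    # pixel column
    y: int    # pixel row
    p: int    # polarity: +1 for a brightness increase, -1 for a decrease


def events_to_frame(events: List[Event], height: int, width: int) -> List[List[int]]:
    """Sum event polarities per pixel into a height x width grid."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        # Skip events that fall outside the sensor bounds.
        if 0 <= ev.y < height and 0 <= ev.x < width:
            frame[ev.y][ev.x] += ev.p
    return frame


events = [
    Event(0.001, 1, 0, +1),
    Event(0.002, 1, 0, +1),
    Event(0.003, 2, 1, -1),
]
frame = events_to_frame(events, height=2, width=3)
print(frame)
```

A grid like this can be fed to a standard convolutional network, but it discards the fine timing of events within the accumulation window, which is one reason richer representations are studied.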
For image/video reconstruction and restoration, the paper reviews deep learning methods for event-based intensity reconstruction, event-guided image/video super-resolution, event-guided video frame interpolation, event-guided image/video deblurring, and event-based deep image/video HDR. It discusses the challenges in these tasks, such as scarce labeled data, high computational complexity, and low-quality reconstructed results, and highlights the key insights and advancements made by the representative methods.
For event-based scene understanding and 3D vision, the paper covers deep learning techniques for semantic segmentation, feature tracking, object classification, object detection and tracking, 3D reconstruction, and visual SLAM. It analyzes the performance of these methods and identifies the critical problems that need to be addressed.
The paper also conducts benchmark experiments for some representative research directions, such as image reconstruction, deblurring, and object recognition, to provide insights and identify the critical problems for future studies.
Finally, the paper discusses open challenges and offers new perspectives to inspire further research in this field, such as event-based neural radiance for 3D reconstruction, cross-modal learning, and event-based model pretraining.
Statistics
"Event cameras capture the per-pixel intensity changes asynchronously and produce event streams encoding the time, pixel position, and polarity (sign) of the intensity changes."
"Event cameras offer some other benefits, such as high temporal resolution and high dynamic range."
"Deep learning has received great attention in this emerging area with amounts of techniques developed based on various purposes."
Quotes
"Event cameras have the potential to overcome the limitations of frame-based cameras in the computer vision and robotics community."
"Previously, Gallego et al. [1] provided the first overview of the event-based vision with a particular focus on the principles and conventional algorithms. However, DL has invigorated almost every field of event-based vision recently, and remarkable advancements in methodologies and techniques have been achieved."
"We survey the DL-based methods by focusing on three important aspects: 1) DNN input representations for event data and quality enhancement (Sec. 2); 2) Current research highlights by typically analyzing two hot fields, image restoration and enhancement (Sec. 3), scene understanding and 3D vision (Sec. 4); 3) Potential future directions, such as event-based neural radiance for 3D reconstruction, cross-modal learning, and event-based model pretraining (Sec. 5.2)."