Li, Y., Han, Q., Yu, M., Jiang, Y., Yeo, C. K., Li, Y., Huang, Z., Liu, N., Chen, H., & Wu, X. (2024). Towards Efficient 3D Object Detection in Bird's-Eye-View Space for Autonomous Driving: A Convolutional-Only Approach. arXiv preprint arXiv:2312.00633v2.
This paper introduces BEVENet, a novel 3D object detection framework for autonomous driving that prioritizes efficiency without sacrificing accuracy. The authors aim to address the computational limitations of existing Vision Transformer (ViT)-based methods by proposing a convolutional-only architecture.
BEVENet employs a six-module structure: a shared ElanNet backbone pretrained on NuImages, an LSS view projection module accelerated with a lookup table, a fully convolutional depth estimation module with data augmentation, a temporal module that uses a 2-second history, a BEV feature encoder built from residual blocks, and a simplified detection head with Circular NMS. The model is evaluated on the NuScenes dataset using mAP and NDS for accuracy and FPS and GFlops for efficiency.
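The Circular NMS step in the detection head can be illustrated with a short sketch. It assumes the common center-distance formulation: detections are visited in descending score order, and a detection is dropped if its BEV center lies within a fixed radius of one already kept. The function name `circular_nms`, the `radius` parameter, and the sample values are hypothetical and are not taken from the paper's code.

```python
import math


def circular_nms(centers, scores, radius):
    """Greedy center-distance NMS on BEV box centers (illustrative sketch).

    centers: list of (x, y) box centers in BEV coordinates
    scores:  list of confidence scores, same length as centers
    radius:  suppression distance (assumed, task-dependent value)
    Returns the indices of the kept detections, highest score first.
    """
    # Visit detections from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        cx, cy = centers[i]
        # Keep the detection only if it is far enough from every kept one.
        if all(
            math.hypot(cx - centers[j][0], cy - centers[j][1]) >= radius
            for j in keep
        ):
            keep.append(i)
    return keep


centers = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1)]
scores = [0.9, 0.8, 0.3, 0.7]
print(circular_nms(centers, scores, radius=1.0))  # [0, 3]
```

Comparing center distances avoids computing rotated-box overlaps, which is one reason this style of suppression is attractive for an efficiency-focused detection head.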
BEVENet requires 161.42 GFlops and runs at 47.6 FPS, which the authors report as state-of-the-art efficiency among the methods they compare against. Its accuracy remains competitive, with an mAP of 45.6 and an NDS of 55.5 (both on a 0-100 scale). The ablation study quantifies how each design choice contributes to the model's efficiency and accuracy.
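For context, NDS is the nuScenes detection score, which combines mAP with five mean true-positive error metrics. The definition below comes from the nuScenes benchmark, not from this summary, and is written with all quantities in the range 0 to 1, so the scores above would correspond to an mAP of 0.456 and an NDS of 0.555.

```latex
\mathrm{NDS} = \frac{1}{10}\left[\, 5\,\mathrm{mAP}
  + \sum_{\mathrm{mTP} \in \mathbb{TP}} \bigl(1 - \min(1,\ \mathrm{mTP})\bigr) \right],
\qquad
\mathbb{TP} = \{\mathrm{mATE},\ \mathrm{mASE},\ \mathrm{mAOE},\ \mathrm{mAVE},\ \mathrm{mAAE}\}
```

Half of the score therefore comes from mAP, and the other half from how small the translation, scale, orientation, velocity, and attribute errors are.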
This research demonstrates that a convolutional-only architecture can achieve state-of-the-art efficiency and competitive accuracy for 3D object detection in BEV space. This finding is significant for deploying such systems in real-world autonomous vehicles with limited computational resources.
This combination of efficiency and accuracy makes BEVENet a promising candidate for real-world autonomous driving applications. The study also highlights the potential of convolutional architectures in resource-constrained environments and motivates further research in this direction.
The authors acknowledge the need to explore the role of multi-view image inputs and the significance of regions of interest (ROIs) within the BEV space in order to improve performance and efficiency further.