Core Concepts
PoIFusion presents a novel multi-modal 3D object detection framework that efficiently fuses RGB images and LiDAR point clouds at points of interest.
Abstract
PoIFusion introduces a query-based approach to multi-modal 3D object detection, focusing on feature fusion at Points of Interest (PoIs). By adaptively generating PoIs from object queries, the framework integrates multi-modal features through dynamic fusion blocks. Extensive experiments on the nuScenes dataset demonstrate state-of-the-art performance, with PoIFusion achieving 74.9% NDS and 73.4% mAP. The method preserves modality-specific information, improves feature alignment, and reduces computational overhead compared to existing approaches.
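To make the fusion idea concrete, the following is a minimal PyTorch-style sketch of sampling per-query PoI features from image and BEV feature maps and combining them with a query-conditioned fusion block. The names (sample_features, DynamicFusionBlock), the use of bilinear grid sampling, and the softmax mixing weights are illustrative assumptions, not the paper's actual implementation; the exact PoI generation from object queries and the dynamic fusion design may differ.

```python
# Illustrative sketch of PoI-based multi-modal fusion (assumed details,
# not the paper's official code).
import torch
import torch.nn as nn
import torch.nn.functional as F


def sample_features(feat_map, points_2d):
    """Bilinearly sample per-PoI features from a single feature map.

    feat_map:  (B, C, H, W) image-view or BEV feature map
    points_2d: (B, N, P, 2) PoI coordinates normalized to [-1, 1]
    returns:   (B, N, P, C) sampled PoI features
    """
    B, C, H, W = feat_map.shape
    _, N, P, _ = points_2d.shape
    grid = points_2d.view(B, N * P, 1, 2)                 # grid_sample expects a 4-D grid
    sampled = F.grid_sample(feat_map, grid, align_corners=False)
    return sampled.view(B, C, N, P).permute(0, 2, 3, 1)   # -> (B, N, P, C)


class DynamicFusionBlock(nn.Module):
    """Query-conditioned fusion of image and BEV PoI features (assumed design)."""

    def __init__(self, dim, num_pois):
        super().__init__()
        # Each object query predicts its own mixing weights over the two modalities.
        self.weight_gen = nn.Linear(dim, 2 * num_pois)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, query, img_feats, bev_feats):
        # query:     (B, N, C)    object query embeddings
        # img_feats: (B, N, P, C) PoI features sampled from the image view
        # bev_feats: (B, N, P, C) PoI features sampled from the BEV view
        B, N, P, C = img_feats.shape
        w = self.weight_gen(query).view(B, N, 2, P).softmax(dim=2)
        w_img = w[:, :, 0].unsqueeze(-1)                   # (B, N, P, 1)
        w_bev = w[:, :, 1].unsqueeze(-1)
        fused = w_img * img_feats + w_bev * bev_feats      # per-PoI weighted sum
        fused = fused.mean(dim=2)                          # aggregate over PoIs
        return query + self.out_proj(fused)                # residual query update
```

In this sketch, PoIs would be projected separately into the image plane and the BEV grid before sampling, so each modality keeps its native representation and is only combined at the sampled points, which is the intuition behind fusing at PoIs rather than in a shared dense feature space.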
Stats
Our PoIFusion achieves 74.9% NDS and 73.4% mAP on the nuScenes dataset.
Extensive experiments were conducted on the nuScenes dataset to evaluate our approach.
The proposed PoIFusion sets a new record on the multi-modal 3D object detection benchmark.