Key Concepts
A gated synthetic-to-real knowledge transfer approach (Gated-S2R-PCP) is proposed to effectively leverage diverse synthetic data for pedestrian crossing prediction in real-world driving scenes.
Summary
The paper proposes a Gated Syn-to-Real Knowledge Transfer approach for Pedestrian Crossing Prediction (Gated-S2R-PCP) to address the limited observations of pedestrian crossing behaviors in real-world driving datasets.
The key insights are:
- The domain gaps vary for different types of information (pedestrian locations, RGB frames, depth/semantic images) between synthetic and real datasets.
- Gated-S2R-PCP incorporates three differentiated knowledge transfer methods (Knowledge Distiller, Style Shifter, and Distribution Approximator) so that each type of synthetic information is transferred to the real dataset in a way suited to its domain gap.
- A Learnable Gated Unit (LGU) is introduced to fuse the transferred knowledge from the three modules, enabling an end-to-end adaptive knowledge transfer for pedestrian crossing prediction.
- A large-scale synthetic dataset, S2R-PCP-3181, is constructed, containing pedestrian locations, RGB frames, depth images, and semantic images.
- Gated-S2R-PCP shows superior performance on the real-world datasets JAAD and PIE compared to state-of-the-art methods.
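The gated fusion idea behind the LGU can be illustrated with a small sketch. The summary does not give the internal form of the LGU, so everything here is an assumption made for illustration: the `GatedFusion` class, the softmax gating over concatenated features, the feature dimension, and the random (untrained) weights are all hypothetical and are not the authors' implementation. The sketch only shows how a gate could weight the outputs of three transfer branches and combine them into one feature vector.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class GatedFusion:
    """Hypothetical gated fusion of several equally sized feature vectors.

    Assumption: each branch gets one scalar gate, computed from the
    concatenation of all branch features. This is NOT the paper's LGU.
    """

    def __init__(self, num_branches, dim, seed=0):
        rng = random.Random(seed)
        # One weight row per branch; in a real model these would be learned.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(num_branches * dim)]
            for _ in range(num_branches)
        ]
        self.bias = [0.0] * num_branches

    def gates(self, features):
        """Return one gate per branch; the gates are positive and sum to 1."""
        concat = [x for f in features for x in f]
        scores = [
            sum(w * x for w, x in zip(row, concat)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return softmax(scores)

    def __call__(self, features):
        """Fuse the branch features as a gate-weighted sum."""
        g = self.gates(features)
        dim = len(features[0])
        return [
            sum(g[i] * features[i][d] for i in range(len(features)))
            for d in range(dim)
        ]


# Placeholder feature vectors standing in for the three branch outputs.
distilled = [0.2, 0.4, 0.1, 0.9]      # e.g. from a Knowledge Distiller
style_shifted = [0.5, 0.3, 0.7, 0.2]  # e.g. from a Style Shifter
approximated = [0.1, 0.8, 0.6, 0.4]   # e.g. from a Distribution Approximator

branches = [distilled, style_shifted, approximated]
lgu = GatedFusion(num_branches=3, dim=4)
gate_values = lgu.gates(branches)
fused = lgu(branches)
```

Because the gates come from a softmax, the fused vector is a convex combination of the branch features. In an end-to-end setting, the gate weights would be trained jointly with the rest of the network so that the prediction loss decides how much each branch contributes.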
Statistics
The synthetic S2R-PCP-3181 dataset contains 3,181 video sequences with 489,740 frames.
The real-world JAAD dataset contains 346 video sequences with 75K frames and 2,786 pedestrians.
The real-world PIE dataset contains 55 sequences with 293K frames and 1,834 pedestrians.
Quotes
"About 50% road crashes involve vulnerable road users (pedestrians, cyclists, and motorbikes) each year [1]. Therefore, safety must be maintained towards automatic or intelligent vehicles, prioritized for the most vulnerable road users [2]."
"Observations: To illustrate the distinct domain gaps for different information in the PCP task, Fig. 1(a) plots the feature distributions of pedestrian locations, RGB frames, and semantic and depth images."