Основні поняття
The proposed GDTS framework integrates goal estimation and a novel two-stage tree sampling diffusion model to generate accurate and diverse multi-modal pedestrian trajectory predictions in real-time.
Анотація
The paper presents a novel framework called GDTS (Goal-Guided Diffusion Model with Tree Sampling) for multi-modal pedestrian trajectory prediction. The key components are:
-
Goal Estimation Module:
- Predicts the probability distribution of the pedestrian's goal position using a U-Net architecture.
- Samples multiple possible goals from this distribution to ensure diversity in the predictions.
-
Trajectory Prediction Module:
- Takes the history trajectory and the estimated goals as input.
- Leverages a diffusion-based model to generate multi-modal future trajectory predictions.
- Introduces a two-stage tree sampling algorithm to accelerate the inference speed without compromising accuracy:
- Trunk stage: Uses a common feature to generate a roughly denoised initial trajectory.
- Branch stage: Further refines the initial trajectory using diverse features to obtain multiple modalities.
The experiments on the ETH/UCY and Stanford Drone datasets demonstrate that GDTS achieves state-of-the-art performance in terms of prediction accuracy while maintaining real-time inference speed. The ablation studies validate the effectiveness of the proposed tree sampling algorithm and the combination of diffusion models.
Статистика
The paper reports the following key metrics:
Average Displacement Error (ADE20) and Final Displacement Error (FDE20) on the Stanford Drone Dataset and ETH/UCY dataset.
Inference time for the proposed method and baselines.
Цитати
"Accurate prediction of pedestrian trajectories is crucial for improving the safety of autonomous driving."
"To address these challenges and facilitate the use of diffusion models in multi-modal trajectory prediction, we propose GDTS, a novel Goal-Guided Diffusion Model with Tree Sampling for multi-modal trajectory prediction."
"Experimental results demonstrate that our proposed framework achieves comparable state-of-the-art performance with real-time inference speed in public datasets."