Core Concepts
This work introduces a novel motion prediction model that predicts Gaussian joint probability density functions (PDFs) for all agent-pairs in a scene, enabling comprehensive statistical analysis of agent interactions and risk assessment.
Summary
The paper presents a multi-agent motion prediction model called MAP-FORMER that goes beyond current practices of predicting marginal trajectories or Gaussian PDFs for individual agents. The key innovation is the ability to predict covariance matrices for agent-pairs, which allows modeling Gaussian joint PDFs for all relevant agent-pairs in a scene.
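For reference, a Gaussian joint PDF over one agent pair at a single future time step can be written as below. The notation (2D positions $\mathbf{x}_i, \mathbf{x}_j$, means $\boldsymbol{\mu}_i, \boldsymbol{\mu}_j$, covariance blocks $\Sigma_{ij}$) is assumed here for illustration and is not taken from the paper.

```latex
p(\mathbf{x}_i, \mathbf{x}_j)
= \mathcal{N}\!\left(
\begin{bmatrix}\mathbf{x}_i \\ \mathbf{x}_j\end{bmatrix};
\begin{bmatrix}\boldsymbol{\mu}_i \\ \boldsymbol{\mu}_j\end{bmatrix},
\begin{bmatrix}
\Sigma_{ii} & \Sigma_{ij} \\
\Sigma_{ij}^{\top} & \Sigma_{jj}
\end{bmatrix}
\right)
```

The off-diagonal block $\Sigma_{ij}$ is the part that per-agent marginal predictions cannot express: if it is zero, the joint PDF factorizes into the two marginals.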
The model consists of four main modules:
- Temporal Encoder (TEnc): Encodes past trajectories of all agents using a Transformer encoder.
- Spatial and Interaction Encoder (SaIEnc): Captures structural and relational information of the scene using either a GNN-based or Transformer-based architecture.
- Factorized Transformer Decoder: Aggregates information from the TEnc and SaIEnc to generate agent embeddings.
- Multihead Agent-Pair Prediction: Predicts multiple trajectory modes per agent and the parameters of the covariance matrices for all agent-pairs.
The covariance matrix prediction is formulated to guarantee symmetry and positive-definiteness, enabling the construction of Gaussian joint PDFs. This provides rich statistical information about agent dependencies and interactions, which is crucial for comprehensive risk assessment in autonomous driving.
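This summary does not state how the paper enforces symmetry and positive-definiteness. One common construction, shown here purely as an assumption, is to let the network output the entries of a lower-triangular Cholesky factor with a positive diagonal and to form the covariance as the product of that factor with its transpose. The function names `pair_covariance` and `joint_log_pdf`, the softplus on the diagonal, and the 4x4 layout (two 2D positions) are hypothetical choices for this sketch, not the authors' implementation.

```python
import numpy as np


def pair_covariance(raw):
    """Map 10 unconstrained numbers to a 4x4 symmetric positive-definite matrix.

    Hypothetical parameterization: the inputs fill the lower triangle of a
    Cholesky factor L, and the covariance is L @ L.T.
    """
    raw = np.asarray(raw, dtype=float)
    L = np.zeros((4, 4))
    L[np.tril_indices(4)] = raw  # 10 entries, filled row by row
    # Softplus plus a small floor keeps the diagonal strictly positive,
    # which makes L invertible and therefore L @ L.T positive-definite.
    L[np.diag_indices(4)] = np.logaddexp(0.0, np.diag(L)) + 1e-6
    return L @ L.T


def joint_log_pdf(x, mean, cov):
    """Log-density of a multivariate Gaussian, evaluated via Cholesky."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    L = np.linalg.cholesky(cov)  # raises LinAlgError if cov is not PD
    z = np.linalg.solve(L, diff)  # z @ z is the squared Mahalanobis distance
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (z @ z + log_det + diff.size * np.log(2.0 * np.pi))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cov = pair_covariance(rng.normal(size=10))  # stand-in for network outputs
    cross_block = cov[:2, 2:]  # dependency between agent i and agent j
    mean = np.array([0.0, 0.0, 5.0, 1.0])  # stacked means of both agents
    print("symmetric:", np.allclose(cov, cov.T))
    print("cross-covariance block:\n", cross_block)
    print("log p at the mean:", joint_log_pdf(mean, mean, cov))
```

Because any matrix of the form L @ L.T with an invertible L is symmetric positive-definite, the constraint holds by construction rather than needing to be checked or penalized during training.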
The authors evaluate their model on the rounD dataset, which contains highly interactive roundabout scenarios. The results show that the MAP-FORMER (full) model, which combines the TEnc with the Transformer-based SaIEnc, outperforms both the joint and the marginal prediction baselines on standard metrics.
The paper concludes by discussing the potential of the predicted agent-pair covariance matrices for statistical analysis of agent interactions and risk assessment, which will be the focus of future work.
Statistics
The maximum number of agents recorded in the rounD dataset for a single frame is 25.
The model predicts trajectory points at a frequency of 5 Hz and receives 1 s of trajectory history as input.
Quotes
"There is a gap in risk assessment of trajectories between the trajectory information coming from a traffic motion prediction module and what is actually needed. Closing this gap necessitates advancements in prediction beyond current practices."
"Existing prediction models yield joint predictions of agents' future trajectories with uncertainty weights or marginal Gaussian probability density functions (PDFs) for single agents. Although, these methods achieve high accurate trajectory predictions, they only provide little or no information about the dependencies of interacting agents."