Core Concepts
TrajFlow, a novel model for probabilistic trajectory prediction, learns distributions over abstracted trajectory features using Normalizing Flows to effectively capture the variability inherent in human behavior.
Abstract
The paper proposes TrajFlow, a new approach for probabilistic trajectory prediction based on Normalizing Flows. The key idea is to recast the problem of learning a distribution over trajectories as one of learning a distribution over abstracted trajectory features produced by an autoencoder, which simplifies the learning task of the Normalizing Flow.
The paper first provides background on Normalizing Flows, a family of generative methods that enable exact likelihood computation: a series of differentiable bijective functions maps the data distribution to a simple, known "base" distribution, and the change-of-variables formula then gives the density of any data point.
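The exact-likelihood property can be illustrated with a minimal sketch. It is not the flow architecture used in TrajFlow: the `AffineFlow1D` layer, the one-dimensional setting, and the hand-picked parameters are assumptions made purely for illustration. The log-density of a point is the base log-density of its transformed value plus the summed log-determinants of the layers.

```python
import math
import random


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


class AffineFlow1D:
    """Hypothetical bijective layer: z = (x - shift) / scale, with scale > 0."""

    def __init__(self, shift, scale):
        if scale <= 0:
            raise ValueError("scale must be positive for a bijection")
        self.shift = shift
        self.scale = scale

    def forward(self, x):
        # Returns the transformed value and log|dz/dx| for this layer.
        return (x - self.shift) / self.scale, -math.log(self.scale)

    def inverse(self, z):
        return z * self.scale + self.shift


def flow_log_prob(x, layers):
    """Exact log-likelihood via the change-of-variables formula."""
    log_det_total = 0.0
    z = x
    for layer in layers:
        z, log_det = layer.forward(z)
        log_det_total += log_det
    return standard_normal_logpdf(z) + log_det_total


def flow_sample(layers, rng):
    """Draw from the base distribution and push it back through the inverses."""
    x = rng.gauss(0.0, 1.0)
    for layer in reversed(layers):
        x = layer.inverse(x)
    return x


# Illustrative parameters only (in a real flow they would be learned).
layers = [AffineFlow1D(shift=1.0, scale=2.0), AffineFlow1D(shift=0.5, scale=1.5)]
print(flow_log_prob(0.7, layers))
print(flow_sample(layers, random.Random(0)))
```

With these parameters the two layers compose to z = (x - 2) / 3, so the sketch represents a normal distribution with mean 2 and standard deviation 3, and `flow_log_prob` matches that density exactly.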
The TrajFlow model consists of three main components:
A Normalizing Flow that learns the distribution of the encoded future trajectories rather than the raw future trajectories.
A Recurrent Neural Network Autoencoder (RNN-AE) that generates an intermediate representation of the trajectories, capturing the most relevant features.
Encoding components for the target agent's past trajectory, social interactions, and static environment information.
The authors validate their approach on two synthetic datasets with known true distributions, as well as on the ETH/UCY, rounD, and nuScenes real-world datasets. The results demonstrate that TrajFlow outperforms state-of-the-art behavior prediction models in capturing full trajectory distributions, especially on the more variable pedestrian datasets. The use of the RNN-AE is shown to be a key factor in the improved performance.
Additionally, the auto-regressive nature of the TrajFlow decoder makes the prediction length flexible, which is particularly useful in scenarios that require a longer planning horizon, such as approaching a roundabout.
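Why an auto-regressive decoder leaves the horizon open can be seen in a toy sketch. The recurrent cell below, its fixed weights, and the names `decode_step` and `rollout` are hypothetical and do not reproduce TrajFlow's decoder. The point is only that each step consumes the previous output, so the number of steps is a free argument at prediction time.

```python
import math


def decode_step(hidden, prev_step):
    """Toy recurrent cell: update the hidden state, emit the next displacement."""
    hx, hy = hidden
    dx, dy = prev_step
    new_hidden = (math.tanh(0.9 * hx + 0.1 * dx), math.tanh(0.9 * hy + 0.1 * dy))
    displacement = (0.5 * new_hidden[0], 0.5 * new_hidden[1])
    return new_hidden, displacement


def rollout(latent, start_xy, horizon):
    """Decode `horizon` positions from a latent code, one step at a time."""
    hidden = latent
    prev_step = (0.0, 0.0)
    x, y = start_xy
    trajectory = []
    for _ in range(horizon):
        # Each step is fed the displacement produced by the previous step.
        hidden, prev_step = decode_step(hidden, prev_step)
        x += prev_step[0]
        y += prev_step[1]
        trajectory.append((x, y))
    return trajectory


latent = (0.8, -0.3)  # stands in for an encoded trajectory feature
short = rollout(latent, (0.0, 0.0), horizon=12)
long = rollout(latent, (0.0, 0.0), horizon=30)
print(len(short), len(long))
```

Because the rollout is deterministic given the latent code, the first 12 positions of the longer prediction coincide with the shorter one; extending the horizon only appends further steps.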
Stats
The paper reports the following key metrics:
minADE (Average L2 distance over all time steps between the ground truth and the closest of the predicted trajectories)
minFDE (L2 distance at the final time step between the ground truth and the closest of the predicted trajectories)
NLL (Negative Log-Likelihood of the ground truth according to the learned distribution)
DJS (Jensen-Shannon divergence between the ground truth distribution and the learned distribution)
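The metrics above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the paper's evaluation code: trajectories are lists of (x, y) points, the "best" prediction is chosen separately for minADE and minFDE, and the Jensen-Shannon divergence is computed between two discrete distributions (for example, histograms) in nats. NLL is omitted because it requires the learned density itself.

```python
import math


def l2(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ade(pred, truth):
    """Average displacement error over all time steps."""
    return sum(l2(p, q) for p, q in zip(pred, truth)) / len(truth)


def fde(pred, truth):
    """Displacement error at the final time step."""
    return l2(pred[-1], truth[-1])


def min_ade(samples, truth):
    return min(ade(s, truth) for s in samples)


def min_fde(samples, truth):
    return min(fde(s, truth) for s in samples)


def kl(p, q):
    """KL divergence for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in nats; symmetric and at most log(2)."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant offset of 1
    [(0.0, 0.0), (1.0, 0.5), (2.0, 1.5)],  # starts exact, then drifts
]
print(min_ade(samples, truth), min_fde(samples, truth))
print(js_divergence([1.0, 0.0], [0.0, 1.0]))
```

In this example the drifting sample gives the lowest average error while the offset sample gives the lowest final error, which shows that the two "min" metrics need not pick the same prediction.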
Quotes
"Predicting the future behavior of human road users is an important aspect for the development of risk-aware autonomous vehicles."
"An example of such multi-modality can be seen at roundabouts, where vehicles have the option to enter the roundabout directly or to wait for an oncoming car to pass."
"While these state-of-the-art approaches already achieve good results in prediction accuracy, they have the fundamental problem of being trained to reproduce the only true future trajectory available for each past trajectory in the dataset, thereby ignoring the underlying stochasticity of human behavior."