The paper introduces Hyper-STTN, a novel framework for human trajectory prediction that combines the strengths of hypergraph neural networks and spatial-temporal transformers. The key highlights are:
Hyper-STTN constructs a set of multiscale hypergraphs to model group-wise social interactions among pedestrians, capturing latent correlations and dependencies within and across groups.
It employs spatial and temporal transformer networks to effectively represent pair-wise spatial and temporal interactions between individual agents.
The heterogeneous group-wise and pair-wise features are then fused through a multi-modal transformer to align the spatial-temporal embeddings.
Finally, a conditional variational autoencoder (CVAE) is used to decode the crowd dynamics representation and generate stochastic trajectory predictions.
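To make the group-wise idea in the highlights above more concrete, here is a minimal sketch of how pedestrians might be grouped into hyperedges at several scales and how features could be exchanged within each group. It is not the Hyper-STTN implementation: the distance-threshold grouping, the `radii` used as "scales", the plain mean aggregation, and all function names are assumptions chosen for illustration, and the learned weights a real hypergraph neural network would use are omitted.

```python
# Illustrative sketch only; not the paper's method.
# Assumption: a group (hyperedge) is the set of agents within `radius` of some agent.
import math


def build_hyperedges(positions, radius):
    """Return the distinct groups of two or more agents found at this radius."""
    edges = set()
    for xi, yi in positions:
        members = frozenset(
            j for j, (xj, yj) in enumerate(positions)
            if math.hypot(xi - xj, yi - yj) <= radius
        )
        if len(members) > 1:  # drop singleton groups
            edges.add(members)
    return sorted(edges, key=sorted)


def multiscale_hyperedges(positions, radii):
    """Build one hypergraph per radius (the hypothetical notion of scale here)."""
    return {r: build_hyperedges(positions, r) for r in radii}


def hypergraph_step(features, hyperedges):
    """One round of node -> hyperedge -> node mean aggregation (no learned weights)."""
    dim = len(features[0])
    # Each hyperedge feature is the mean of its member node features.
    edge_feats = [
        [sum(features[n][d] for n in e) / len(e) for d in range(dim)]
        for e in hyperedges
    ]
    out = []
    for n, feat in enumerate(features):
        incident = [ef for e, ef in zip(hyperedges, edge_feats) if n in e]
        if not incident:
            out.append(list(feat))  # an ungrouped agent keeps its own feature
        else:
            out.append(
                [sum(ef[d] for ef in incident) / len(incident) for d in range(dim)]
            )
    return out


# Usage: three nearby pedestrians and one far away, grouped at two radii.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
graphs = multiscale_hyperedges(positions, radii=(1.5, 20.0))
updated = hypergraph_step(positions, graphs[1.5])
```

Unlike an ordinary graph edge, a hyperedge can join any number of agents, which is why it suits group-level interactions, while pair-wise interactions are left to the transformers.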
The proposed Hyper-STTN framework outperforms state-of-the-art baselines on several public pedestrian trajectory datasets, demonstrating its effectiveness in modeling complex social interactions for accurate crowd movement forecasting.
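The summary does not name the evaluation metrics. Pedestrian trajectory benchmarks commonly report average displacement error (ADE) and final displacement error (FDE), taking the best of K sampled trajectories when the model is stochastic. Assuming that convention applies here, the sketch below shows how such scores could be computed; the function names and the toy data are hypothetical.

```python
# Illustrative sketch of best-of-K ADE/FDE; the metric choice is an assumption,
# since the summary above does not state which metrics were used.
import math


def ade(pred, truth):
    """Average Euclidean distance between prediction and ground truth over all steps."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def fde(pred, truth):
    """Euclidean distance between the final predicted and final true positions."""
    return math.dist(pred[-1], truth[-1])


def best_of_k(samples, truth):
    """Return (min ADE, min FDE) over K sampled trajectories.

    The two minima are taken independently, so they may come from different samples.
    """
    return (
        min(ade(s, truth) for s in samples),
        min(fde(s, truth) for s in samples),
    )


# Usage: two sampled futures for one pedestrian walking along the x-axis.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],
]
min_ade, min_fde = best_of_k(samples, truth)
```

Scoring the best of several samples rewards a stochastic decoder such as a CVAE for covering the plausible futures, and does not penalize it for samples that miss.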
Key insights distilled from the paper by Weizheng Wan... at arxiv.org (09-19-2024):
https://arxiv.org/pdf/2401.06344.pdf