The paper presents CtRL-Sim, a framework for generating controllable and reactive driving agent behaviors in simulation. The key insights are:
CtRL-Sim employs return-conditioned offline reinforcement learning to model the joint distribution of agent actions and returns. This allows for fine-grained control over agent behaviors by exponentially tilting the predicted return distribution.
The CtRL-Sim architecture is an autoregressive multi-agent Decision Transformer that predicts sequences of future states, actions, and returns-to-go. Predicting future states makes the approach model-based, and this additional prediction target provides a useful regularizing signal.
The Nocturne simulator is extended with a Box2D physics engine to enable realistic vehicle dynamics and collision interactions.
The paper demonstrates that CtRL-Sim can efficiently generate diverse and realistic safety-critical scenarios while providing intuitive control over agent behaviors. Finetuning CtRL-Sim on simulated long-tail scenarios further enhances its ability to generate targeted adversarial behaviors.
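To make the exponential-tilting idea concrete, here is a minimal sketch. It is not the paper's implementation. It assumes the model outputs a discrete distribution over binned returns, and that tilting reweights each bin by exp(kappa * G) before renormalizing. The function names, the bin values, and the coefficient name `kappa` are illustrative assumptions.

```python
import math


def tilt_return_distribution(probs, returns, kappa):
    """Exponentially tilt a discrete return distribution.

    Assumed form (illustrative, not taken from the paper's code):
        p_tilted(G) is proportional to p(G) * exp(kappa * G)
    probs   : predicted probabilities, one per return bin
    returns : representative return value of each bin
    kappa   : tilting coefficient; 0 leaves the distribution unchanged
    """
    if len(probs) != len(returns):
        raise ValueError("probs and returns must have the same length")

    # Work in log space so large |kappa * G| does not overflow.
    logits = [
        math.log(p) + kappa * g if p > 0 else -math.inf
        for p, g in zip(probs, returns)
    ]
    max_logit = max(logits)
    if max_logit == -math.inf:
        raise ValueError("probs must contain at least one positive entry")

    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_return(probs, returns):
    """Mean return under a discrete distribution."""
    return sum(p * g for p, g in zip(probs, returns))


# Hypothetical three-bin example: low, neutral, and high return.
bins = [-1.0, 0.0, 1.0]
predicted = [0.2, 0.3, 0.5]

cautious = tilt_return_distribution(predicted, bins, kappa=2.0)
adversarial = tilt_return_distribution(predicted, bins, kappa=-2.0)
```

Under these assumptions, a positive `kappa` shifts probability mass toward high-return bins and a negative `kappa` shifts it toward low-return bins, which is one way a single scalar could steer an agent between well-behaved and adversarial driving.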
Key insights distilled from
by Luke Rowe, Ro... at arxiv.org, 04-01-2024
https://arxiv.org/pdf/2403.19918.pdf