The paper presents CtRL-Sim, a framework for generating controllable and reactive driving agent behaviors in simulation. The key insights are:
CtRL-Sim employs return-conditioned offline reinforcement learning to model the joint distribution of agent actions and returns. This allows for fine-grained control over agent behaviors by exponentially tilting the predicted return distribution.
The CtRL-Sim architecture is based on an autoregressive multi-agent Decision Transformer that predicts the sequence of future states, actions, and returns-to-go. Because the model predicts future states as well as actions, the approach is model-based, and this additional prediction target provides a useful regularizing signal.
The Nocturne simulator is extended with a Box2D physics engine to enable realistic vehicle dynamics and collision interactions.
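To make the returns-to-go quantity mentioned above concrete, here is a minimal sketch of how it is usually computed in Decision Transformer style models: the return-to-go at step t is the sum of rewards from t to the end of the trajectory. The function name `returns_to_go` and the undiscounted sum are assumptions for illustration; this summary does not state the reward definition or any discounting that CtRL-Sim uses.

```python
def returns_to_go(rewards):
    """Return the undiscounted return-to-go for each timestep.

    rewards: list of per-step rewards for one agent (hypothetical input format).
    Element t of the result is rewards[t] + rewards[t+1] + ... + rewards[-1].
    """
    rtg = []
    running = 0.0
    # Accumulate from the last step backwards so each step sees its future sum.
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


if __name__ == "__main__":
    # A toy trajectory: a small reward, a neutral step, then a penalty.
    print(returns_to_go([1.0, 0.0, -2.0]))
```

In a return-conditioned model, these values are fed in as tokens alongside states and actions, so the model can learn which actions tend to accompany high or low future returns.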
The paper demonstrates that CtRL-Sim can efficiently generate diverse and realistic safety-critical scenarios while providing intuitive control over agent behaviors. Fine-tuning CtRL-Sim on simulated long-tail scenarios further enhances its ability to generate targeted adversarial behaviors.
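Exponential tilting of a predicted return distribution can be illustrated with a short sketch. Given a discrete distribution p(g) over return values, the tilted distribution is proportional to p(g) * exp(kappa * g). The return bins, the probabilities, and the names `tilt_distribution` and `kappa` are hypothetical and chosen for illustration only; they are not taken from the paper's code.

```python
import math


def tilt_distribution(probs, returns, kappa):
    """Exponentially tilt a discrete distribution over returns.

    probs:   probabilities p(g) for each return bin (assumed to sum to 1).
    returns: the return value g associated with each bin.
    kappa:   tilt coefficient; 0 leaves the distribution unchanged,
             positive values favor high returns, negative values low returns.
    """
    if len(probs) != len(returns):
        raise ValueError("probs and returns must have the same length")
    if not any(p > 0 for p in probs):
        raise ValueError("at least one probability must be positive")
    # Work in log space; bins with zero probability stay at zero.
    logits = [
        math.log(p) + kappa * g if p > 0 else float("-inf")
        for p, g in zip(probs, returns)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_return(probs, returns):
    """Mean return under a discrete distribution."""
    return sum(p * g for p, g in zip(probs, returns))


if __name__ == "__main__":
    bins = [-1.0, 0.0, 1.0]    # hypothetical return values
    base = [0.2, 0.5, 0.3]     # hypothetical predicted probabilities
    for kappa in (-2.0, 0.0, 2.0):
        tilted = tilt_distribution(base, bins, kappa)
        print(kappa, expected_return(tilted, bins))
```

With a negative `kappa`, probability mass moves toward low-return bins, which is how a tilt could steer an agent toward riskier or more adversarial behavior; a positive `kappa` does the opposite.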