The paper proposes a trajectory planning method for autonomous vehicles based on reinforcement learning (RL) that addresses limitations of both traditional trajectory planning approaches and existing RL-based methods.
The key aspects of the proposed method are:
Reward Prediction (RP): The method stabilizes learning by predicting expected future rewards rather than relying on sampled returns, which can have high variance.
Iterative Reward Prediction (IRP): To further improve the accuracy of reward prediction, the method iteratively plans new trajectories at the predicted future states and predicts the rewards of those trajectories (see the first sketch after this list).
Uncertainty Propagation: The method accounts for uncertainty in the perception, prediction, and control modules by propagating the uncertainty of the predicted states of both the ego vehicle and other traffic participants, using a Kalman filter-based propagation step together with a Minkowski-sum collision check (see the second sketch after this list).
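The RP/IRP loop can be illustrated with a minimal sketch. Here `plan_trajectory` and `predict_reward` are hypothetical stubs standing in for the paper's trajectory planner and learned reward predictor; only the replan-at-the-predicted-state loop reflects the IRP idea itself.

```python
import numpy as np

def plan_trajectory(state):
    """Hypothetical planner stub (assumption, not the paper's planner):
    returns a candidate trajectory as a sequence of future states."""
    # Illustration only: a straight 5-step rollout along the x-axis.
    return [state + i * np.array([1.0, 0.0]) for i in range(1, 6)]

def predict_reward(trajectory):
    """Hypothetical learned reward predictor (assumption): maps a planned
    trajectory to an expected reward, avoiding high-variance sampled returns."""
    # Illustration only: penalize lateral deviation along the trajectory.
    return -sum(abs(s[1]) for s in trajectory)

def iterative_reward_prediction(state, iterations=3, gamma=0.99):
    """IRP sketch: replan at each predicted terminal state and accumulate
    a discounted sum of predicted rewards."""
    total, discount = 0.0, 1.0
    for _ in range(iterations):
        traj = plan_trajectory(state)
        total += discount * predict_reward(traj)
        state = traj[-1]          # roll forward to the predicted state
        discount *= gamma
    return total

if __name__ == "__main__":
    # Start slightly off-center in the lateral (y) direction.
    print(iterative_reward_prediction(np.array([0.0, 0.5])))
```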
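Uncertainty propagation and the collision check can likewise be sketched under simplifying assumptions: a linear Kalman-style covariance update P <- F P Fᵀ + Q, and circular vehicle footprints, for which the Minkowski sum of two k-sigma-inflated discs reduces to a single distance test. The paper's actual footprint geometry and noise models may differ.

```python
import numpy as np

def propagate_covariance(P, F, Q, steps):
    """Kalman-style covariance propagation: P <- F P F^T + Q per step.
    Returns the position covariance at each predicted step."""
    covs = []
    for _ in range(steps):
        P = F @ P @ F.T + Q
        covs.append(P.copy())
    return covs

def minkowski_circle_collision(ego_pos, ego_r, obs_pos, obs_r,
                               P_ego, P_obs, k=3.0):
    """Collision check with circular footprints: inflate each disc by a
    k-sigma bound from its covariance; the Minkowski sum of two discs is a
    disc whose radius is the sum, so the check reduces to a distance test."""
    sigma_ego = np.sqrt(np.max(np.linalg.eigvalsh(P_ego)))
    sigma_obs = np.sqrt(np.max(np.linalg.eigvalsh(P_obs)))
    inflated = ego_r + obs_r + k * (sigma_ego + sigma_obs)
    return np.linalg.norm(np.asarray(ego_pos, float)
                          - np.asarray(obs_pos, float)) <= inflated

if __name__ == "__main__":
    F, Q = np.eye(2), 0.05 * np.eye(2)   # assumed linear model and noise
    covs = propagate_covariance(0.01 * np.eye(2), F, Q, steps=5)
    # Uncertainty grows with the horizon, so a gap that is safe at t=0
    # may be flagged as a potential collision at t=5.
    print(minkowski_circle_collision((0.0, 0.0), 1.0, (3.0, 0.0), 1.0,
                                     covs[-1], covs[-1]))
```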
The proposed method is evaluated in the CARLA simulator across various scenarios, including lane following, lane changing, and navigating around static obstacles and other traffic participants. Compared to baseline methods, the proposed method with IRP and uncertainty propagation significantly reduces the collision rate and increases the average reward.