Core Concepts
A robust and safe reinforcement learning framework that incorporates general environment disturbances through an optimal transport cost uncertainty set, with an efficient implementation that applies Optimal Transport Perturbations to construct worst-case virtual state transitions.
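As a rough formalization (the notation below is illustrative and not quoted from the source), the uncertainty set collects all transition models within an optimal transport budget of the nominal training model, and the policy is optimized against the worst model in that set subject to a safety constraint:

$$
\mathcal{P}_{s,a} = \left\{ p(\cdot \mid s,a) \,:\, \mathrm{OT}_c\!\big(p(\cdot \mid s,a),\, \bar{p}(\cdot \mid s,a)\big) \le \varepsilon \right\},
$$

$$
\max_{\pi}\ \min_{p \in \mathcal{P}}\ \mathbb{E}_{\pi,p}\!\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t,a_t)\Big]
\quad \text{s.t.} \quad \max_{p \in \mathcal{P}}\ \mathbb{E}_{\pi,p}\!\Big[\sum_{t=0}^{\infty} \gamma^{t}\, d(s_t,a_t)\Big] \le B,
$$

where $\bar{p}$ is the training-environment transition model, $\mathrm{OT}_c$ is the optimal transport cost under a ground cost $c$, $d$ is a per-step safety cost, and $\varepsilon$ and $B$ are the robustness and safety budgets.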
Summary
The content presents a robust and safe reinforcement learning (RL) framework that incorporates general environment disturbances using an optimal transport cost uncertainty set. The key highlights are:
- Formulation of a safe RL framework that provides robustness to general disturbances using the optimal transport cost between transition models.
- Theorem 1 shows that the resulting worst-case optimization problems over transition models can be reformulated as adversarial perturbations to state transitions in the training environment (a sketch of this reformulation follows the list).
- Proposal of an efficient deep RL implementation of Optimal Transport Perturbations, which construct worst-case virtual state transitions without impacting data collection during training (see the code sketch at the end of this summary).
- Experimental results on continuous control tasks with safety constraints demonstrate that the use of Optimal Transport Perturbations leads to robust performance and safety both during training and in the presence of disturbances, outperforming standard safe RL, adversarial RL, domain randomization, and distributionally robust safe RL approaches.
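A hedged sketch of what the Theorem 1 reformulation referenced above typically looks like (illustrative notation, not the paper's exact statement): the worst case over transition models equals an adversarial perturbation $\Delta$ applied to next states sampled from the training model, with average ground cost kept within the transport budget:

$$
\min_{p \in \mathcal{P}_{s,a}}\ \mathbb{E}_{s' \sim p}\big[V(s')\big]
\;=\;
\min_{\Delta \,:\, \mathbb{E}_{s' \sim \bar{p}}[\,c(s',\, s' + \Delta(s'))\,] \le \varepsilon}\ \mathbb{E}_{s' \sim \bar{p}}\Big[V\big(s' + \Delta(s')\big)\Big],
$$

so the adversary never has to be realized in the physical environment; it only perturbs transitions already sampled from it.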
The framework makes limited assumptions about the data collection process during training and does not require directly modifying the environment, making it compatible with many real-world decision-making applications.
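Below is a minimal PyTorch sketch of the perturbation idea referenced in the highlights, not the authors' released implementation: observed next states are pushed toward worst-case virtual next states by projected gradient steps on a critic, with a per-sample L2 ball of radius `epsilon` standing in for the transport budget; `critic`, `epsilon`, and `n_steps` are illustrative placeholders.

```python
import torch


def ot_perturb_next_states(critic, next_states, epsilon=0.1, n_steps=5):
    """Projected gradient descent on the critic value w.r.t. the next
    state, keeping each sample's displacement inside an L2 ball of
    radius `epsilon` (a simple stand-in for the transport budget)."""
    perturbed = next_states.clone().requires_grad_(True)
    step = epsilon / n_steps
    for _ in range(n_steps):
        # Worst case for the reward critic: drive the value down.
        value = critic(perturbed).sum()
        grad, = torch.autograd.grad(value, perturbed)
        with torch.no_grad():
            perturbed -= step * grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)
            # Project the accumulated displacement back onto the budget.
            delta = perturbed - next_states
            norm = delta.norm(dim=-1, keepdim=True).clamp(min=1e-8)
            perturbed.copy_(next_states + delta * (epsilon / norm).clamp(max=1.0))
    return perturbed.detach()


# Toy usage with a random critic and a batch of 4-dimensional next states.
critic = torch.nn.Sequential(torch.nn.Linear(4, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
s_next = torch.randn(32, 4)
virtual_s_next = ot_perturb_next_states(critic, s_next)
```

The virtual next states would then replace the observed ones in the Bellman targets (ascending rather than descending the gradient for a safety-cost critic), which is why data collection itself is untouched.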
Statistics
The content reports no explicit numerical values or metrics; key results are instead presented as relative performance comparisons and percentages of safety-constraint satisfaction across algorithms and test environments.