This paper introduces a novel approach to training verifiably safe control policies in nonlinear dynamical systems by combining deep reinforcement learning with finite-step reachability verification, achieving significantly longer safety verification horizons than existing safe RL methods.
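As a rough illustration of the finite-step idea only (not the paper's verifier, which uses formal reachability analysis to bound all trajectories rather than sampled ones), a sampling-based k-step safety check might look like this; `dynamics`, `policy`, `is_safe`, and `horizon` are hypothetical stand-ins:

```python
import numpy as np

def finite_step_safety_check(dynamics, policy, is_safe, init_states, horizon):
    """Return True if every sampled trajectory stays safe for `horizon` steps."""
    states = np.asarray(init_states)
    for _ in range(horizon):
        actions = policy(states)
        states = dynamics(states, actions)
        if not np.all(is_safe(states)):
            return False  # a reachable state left the safe set
    return True
```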
This paper introduces C-TRPO, a novel safe reinforcement learning algorithm that modifies the geometry of the policy space to ensure constraint satisfaction throughout training, achieving reward competitive with existing methods while incurring fewer constraint violations.
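A hedged sketch of one way to read "modifying the policy space geometry": augment the trust-region divergence with a log-barrier on the expected cost, so that update steps toward the constraint boundary are penalized. All names (`kl`, `cost`, `cost_k`, `cost_limit`, `beta`) are illustrative, not C-TRPO's actual formulation:

```python
import torch

def constrained_divergence(kl, cost, cost_k, cost_limit, beta=1.0):
    # Zero when cost == cost_k; grows as the candidate policy's expected
    # cost approaches the limit, bending the trust region away from the
    # boundary of the safe policy set.
    slack = torch.clamp(cost_limit - cost, min=1e-8)
    slack_k = torch.clamp(cost_limit - cost_k, min=1e-8)
    return kl + beta * (torch.log(slack_k) - torch.log(slack))
```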
C-MCTS is a novel algorithm that enhances safety in reinforcement learning by pre-training a safety critic to guide Monte Carlo Tree Search, enabling efficient planning and constraint satisfaction in complex environments.
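A minimal sketch of the guiding mechanism, under the assumption that the pre-trained safety critic scores state-action pairs by predicted cost; `safety_critic`, `step`, and `cost_threshold` are hypothetical names, not the paper's interface:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    state: object
    children: dict = field(default_factory=dict)

def expand(node, actions, step, safety_critic, cost_threshold):
    """Add only children the pre-trained safety critic deems safe."""
    for action in actions:
        # Prune branches with high predicted cost, so search effort is
        # spent only on the remaining, likely-safe subtrees.
        if safety_critic(node.state, action) > cost_threshold:
            continue
        node.children[action] = Node(state=step(node.state, action))
```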
Enforcing safety constraints in online Linear Quadratic Regulator (LQR) learning can actually lead to faster learning: this work establishes regret bounds comparable to the unconstrained setting, even against stronger baselines and under a variety of noise distributions.
This research paper introduces the Simple-to-Complex Collaborative Decision (S2CD) framework, which trains reinforcement learning agents for autonomous driving to be both safe and efficient by transferring knowledge from a teacher model trained in a simplified environment.
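One plausible reading of the teacher-student collaboration, sketched under the assumption that the teacher overrides the student whenever the student's proposed action looks risky; `teacher`, `student`, `risk`, and `risk_threshold` are hypothetical:

```python
def collaborative_action(state, teacher, student, risk, risk_threshold):
    a_student = student(state)
    # Defer to the teacher (trained in the simplified environment) in
    # risky situations; the student takes over as its actions become safe.
    if risk(state, a_student) > risk_threshold:
        return teacher(state)
    return a_student
```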
This research paper introduces DOPE+, a novel algorithm for safe reinforcement learning in constrained Markov decision processes (CMDPs), which achieves an improved regret upper bound while guaranteeing no constraint violation during the learning process.
ACTSAFE is a novel model-based reinforcement learning algorithm for continuous action spaces that explores by optimistically exploiting epistemic uncertainty while guaranteeing safety through pessimism.
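A hedged sketch of the optimism/pessimism split, assuming epistemic uncertainty is approximated by disagreement in a dynamics-model ensemble; the names and the scoring rule are illustrative, not ACTSAFE's implementation:

```python
import numpy as np

def score_action(models, reward_fn, cost_fn, state, action, kappa=1.0):
    preds = np.stack([m(state, action) for m in models])  # ensemble next states
    mean, std = preds.mean(axis=0), preds.std(axis=0)
    # Optimistic for exploration: bonus for epistemic uncertainty.
    value = reward_fn(mean) + kappa * std.sum()
    # Pessimistic for safety: charge the worst-case predicted cost.
    worst_cost = max(cost_fn(p) for p in preds)
    return value, worst_cost
```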
Decision Points RL (DPRL) improves the safety and efficiency of batch reinforcement learning by focusing policy improvements on frequently visited state-action pairs (decision points) while deferring to the behavior policy in less explored areas.
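A minimal sketch of the stated mechanism, assuming count-based visitation statistics extracted from the batch; `counts`, `q_values`, `behavior_policy`, and `min_visits` are hypothetical stand-ins:

```python
def dprl_action(state, counts, q_values, behavior_policy, min_visits=50):
    visited = {a: n for a, n in counts[state].items() if n >= min_visits}
    if not visited:
        # Off the well-supported region: defer to the behavior policy.
        return behavior_policy(state)
    # At a "decision point": improve over the behavior policy, but only
    # among actions with enough data to trust the value estimates.
    return max(visited, key=lambda a: q_values[state][a])
```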
This paper introduces a novel model-free reinforcement learning method, the Safety Modulator Actor-Critic (SMAC), which addresses safety constraints and value overestimation, demonstrating its effectiveness on a UAV hovering task.
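An illustrative reading of an action "safety modulator": a learned module applies a bounded correction to the actor's raw action, leaving the actor free to optimize reward alone. All names here are assumptions, not SMAC's API:

```python
import torch

def modulated_action(actor, modulator, state, max_delta=0.2):
    raw = actor(state)
    # The modulator outputs a bounded safety correction; clamping keeps
    # the modulated action close to the actor's original intent.
    delta = torch.clamp(modulator(state, raw), -max_delta, max_delta)
    return raw + delta
```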
This research paper introduces a novel safe reinforcement learning framework that combines disturbance observers and residual model learning to enhance the robustness and safety of control policies in environments with internal and external disturbances.
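A hedged sketch of the disturbance-observer half of the idea: a first-order discrete-time observer that attributes the mismatch between a nominal model and observed transitions to disturbances (a residual model could be trained on the same mismatch signal). `nominal_model` and the gain `L` are illustrative choices:

```python
def observe_disturbance(d_hat, nominal_model, x, u, x_next, L=0.5):
    # Residual between the observed next state and the nominal prediction
    # is attributed to internal/external disturbances.
    residual = x_next - nominal_model(x, u)
    # First-order observer update: low-pass filter the residual estimate.
    return d_hat + L * (residual - d_hat)
```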