This paper introduces C-PG, a policy gradient framework for constrained reinforcement learning that guarantees global last-iterate convergence under both action-based and parameter-based exploration, a guarantee prior constrained policy gradient methods do not provide.
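For context, methods in this family typically optimize a Lagrangian relaxation of the constrained objective with alternating primal (policy) and dual (multiplier) updates. The sketch below is a minimal generic illustration of that pattern, not the C-PG algorithm itself; the environment interface, tabular softmax parameterization, step sizes, and cost budget `b` are all assumptions made for illustration.

```python
import numpy as np

# Generic primal-dual policy-gradient sketch for a constrained MDP
# (illustrative Lagrangian pattern, NOT the C-PG algorithm).
# Assumed interface: env.reset() -> state, env.step(a) -> (state, reward, cost, done).

def softmax_policy(theta, s):
    logits = theta[s]                          # tabular parameterization (assumption)
    p = np.exp(logits - logits.max())
    return p / p.sum()

def primal_dual_pg(env, n_states, n_actions, b=0.1,
                   alpha=0.05, beta=0.01, iters=1000, gamma=0.99):
    theta = np.zeros((n_states, n_actions))    # policy parameters (primal variable)
    lam = 0.0                                  # Lagrange multiplier (dual variable)
    for _ in range(iters):
        # Roll out one episode, accumulating discounted reward R and cost C.
        s, done, traj, R, C, t = env.reset(), False, [], 0.0, 0.0, 0
        while not done:
            p = softmax_policy(theta, s)
            a = np.random.choice(n_actions, p=p)
            s2, r, c, done = env.step(a)
            traj.append((s, a))
            R += gamma**t * r
            C += gamma**t * c
            s, t = s2, t + 1
        # Primal ascent on the Lagrangian L = R - lam * (C - b) (REINFORCE estimate).
        g = R - lam * (C - b)
        for s_t, a_t in traj:
            p = softmax_policy(theta, s_t)
            grad_log = -p                      # grad of log softmax: e_{a_t} - p
            grad_log[a_t] += 1.0
            theta[s_t] += alpha * g * grad_log
        # Projected dual ascent: raise lam when the cost budget b is violated.
        lam = max(0.0, lam + beta * (C - b))
    return theta, lam
```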
This paper presents an algorithm that efficiently computes near-optimal deterministic policies for constrained reinforcement learning (CRL) problems with time-space recursive cost criteria.
This paper introduces PD-ANPG, an algorithm for constrained reinforcement learning with general parameterized policies that achieves state-of-the-art sample complexity, closing the gap between existing upper bounds and the theoretical lower bound.
The paper proposes Adversarial Constrained Policy Optimization (ACPO), a method that improves constrained reinforcement learning by dynamically adjusting cost budgets during training, yielding a better balance between reward maximization and constraint satisfaction.
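The budget-adjustment idea can be pictured as a schedule that starts from a relaxed cost budget and tightens it toward the true target as training progresses, informed by the agent's recent constraint performance. The function below is a hypothetical illustration of such a schedule, not the ACPO update; the names `b_target`, `avg_cost`, and `slack` are invented for this sketch, and the result would replace the fixed budget `b` in a Lagrangian dual update like the one sketched earlier.

```python
def adjusted_budget(b_target, avg_cost, step, total_steps, slack=0.2):
    """Hypothetical cost-budget schedule (illustrative only, not the ACPO rule).

    Starts from a relaxed budget and anneals linearly toward the true target
    b_target, tightening immediately if the agent's recent average cost is
    already comfortably under budget.
    """
    progress = step / total_steps                  # 0 -> 1 over training
    relaxed = b_target * (1.0 + slack)             # loose budget early on
    scheduled = relaxed + (b_target - relaxed) * progress
    # Never relax below the true target; tighten early when costs are low.
    return min(scheduled, max(b_target, avg_cost))
```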
This paper introduces a policy gradient reinforcement learning algorithm designed specifically for finite-horizon constrained Markov decision processes (CMDPs), demonstrating superior performance over existing infinite-horizon constrained RL algorithms in time-critical scenarios.
This paper introduces a switching-based reinforcement learning algorithm that guarantees probabilistic satisfaction of temporal logic constraints throughout the learning process while still maximizing reward.
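One common way to realize such guarantees is to switch between a reward-maximizing task policy and a constraint-satisfying backup policy whenever an estimate of the violation probability exceeds a tolerance. The snippet below sketches that generic switching pattern; the interfaces `task_policy`, `safe_policy`, `violation_prob`, and the tolerance `delta` are assumptions for illustration, not the paper's specific construction.

```python
def switching_action(state, task_policy, safe_policy, violation_prob, delta=0.05):
    """Generic switching rule (illustrative sketch, not the paper's algorithm).

    Follows the reward-maximizing task policy while the estimated probability
    of violating the temporal-logic constraint stays below the tolerance
    delta, and falls back to a safe backup policy otherwise.
    """
    if violation_prob(state) <= delta:
        return task_policy(state)   # exploit: maximize reward
    return safe_policy(state)       # fall back: preserve the probabilistic guarantee
```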