The authors introduce novel meta-algorithms for Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL) with static Lipschitz risk measures, obtaining statistically efficient methods whose regret upper bounds scale as √K in the number of episodes K.
The authors present a framework for Risk-Sensitive Reinforcement Learning (RSRL) based on Optimized Certainty Equivalents (OCE), a family that generalizes many common risk measures. By reducing the risk-sensitive problem to standard RL, they propose two meta-algorithms: one based on optimism and another on policy optimization.
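To make the OCE family concrete: with the piecewise-linear utility u(t) = -(1/α)·max(-t, 0), the OCE value sup_λ {λ + E[u(X − λ)]} recovers CVaR at level α (the Rockafellar–Uryasev form). The sketch below is illustrative only and is not the paper's algorithm; the grid-search estimator and all names are assumptions for this example.

```python
import numpy as np

# OCE_u(X) = sup_lambda { lambda + E[ u(X - lambda) ] }
# With u(t) = -(1/alpha) * max(-t, 0), the OCE equals CVaR_alpha.
# Illustrative sketch only; not the paper's method.

def oce(samples, u, lam_grid):
    """Empirical OCE via grid search over the certainty-equivalent level lambda."""
    vals = [lam + np.mean(u(samples - lam)) for lam in lam_grid]
    return max(vals)

alpha = 0.5
u_cvar = lambda t: -(1.0 / alpha) * np.maximum(-t, 0.0)

x = np.array([1.0, 2.0, 3.0, 4.0])              # empirical return samples
lam_grid = np.linspace(x.min(), x.max(), 301)

cvar_oce = oce(x, u_cvar, lam_grid)             # OCE form of CVaR
cvar_direct = np.sort(x)[: int(alpha * len(x))].mean()  # average of worst-alpha tail

print(cvar_oce, cvar_direct)  # both approximately 1.5 for this sample
```

Other choices of u recover other members of the family, e.g. u(t) = t gives the plain expectation, which is why one framework covers risk-neutral and risk-averse objectives alike.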
RS-DisRL provides statistically efficient algorithms for risk-sensitive reinforcement learning with static Lipschitz risk measures.