Policy Bifurcation in Safe Reinforcement Learning: Understanding the Need for Discontinuous Policies
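In some scenarios, a feasible policy should be discontinuous or multi-valued, because a continuous policy that interpolates between separated safe behaviors would pass through constraint-violating actions. This challenges the common assumption in safe reinforcement learning that policies are continuous.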
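The claim can be illustrated with a toy example that is not taken from the source: a one-dimensional obstacle-avoidance task in which the agent starts at a lateral offset `y0` in [-1, 1], applies a bounded lateral shift, and must end outside the band `|y| < R`. The names `R`, `A_MAX`, `bifurcated_policy`, and `smooth_policy` are hypothetical and chosen only for this sketch. Because the final position is a continuous function of `y0` under any continuous policy, and must be at most `-R` at one end of the interval and at least `R` at the other, the intermediate value theorem forces some start state into the obstacle; a policy that jumps at `y0 = 0` avoids this.

```python
import numpy as np

# Assumed toy setup (not from the source): the obstacle occupies |y| < R,
# and the agent can shift laterally by at most A_MAX in one step.
R = 0.5
A_MAX = 1.0


def final_position(y0, policy):
    """Lateral position after applying the (clipped) action chosen by `policy`."""
    return y0 + np.clip(policy(y0), -A_MAX, A_MAX)


def violates(y0, policy):
    """True where the final position lies inside the obstacle band."""
    return np.abs(final_position(y0, policy)) < R


def bifurcated_policy(y0):
    """Discontinuous at y0 = 0: steer fully right on one side, fully left on the other."""
    return np.where(y0 >= 0.0, A_MAX, -A_MAX)


def smooth_policy(y0, k=20.0):
    """A continuous approximation of the same behavior (steep tanh)."""
    return A_MAX * np.tanh(k * y0)


# Evaluate both policies on a dense grid of start states.
starts = np.linspace(-1.0, 1.0, 2001)
n_bifurcated = int(np.sum(violates(starts, bifurcated_policy)))
n_smooth = int(np.sum(violates(starts, smooth_policy)))

print("violations, bifurcated policy:", n_bifurcated)
print("violations, smooth policy:", n_smooth)
```

In this sketch, increasing the steepness `k` narrows the set of start states where the smooth policy fails but cannot remove it, which is the sense in which continuity itself, rather than a poor choice of parameters, causes the violation.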