Alapfogalmak
Enforcing safety constraints in online Linear Quadratic Regulator (LQR) learning can lead to faster learning rates, achieving regret bounds comparable to unconstrained settings, even with stronger baselines and various noise distributions.