Extending Provably Correct Reinforcement Learning to Continuous Action Spaces
This work extends provably correct reinforcement learning algorithms for low-rank Markov Decision Processes (MDPs) to continuous action spaces, without discretizing the action space.