Adaptive Step-Size Policy Gradient with Polyak Approach for Efficient Reinforcement Learning
This paper introduces an adaptive step-size method for policy gradient in reinforcement learning, inspired by the Polyak step size. The method removes the need for sensitive step-size tuning and, compared with existing approaches, converges faster and yields more stable policies.
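
To make the Polyak step-size idea concrete, here is a minimal sketch of the classical rule adapted to maximization: the step is the current optimality gap divided by the squared gradient norm, eta = (J* - J(theta)) / ||grad J(theta)||^2. It is an illustration under strong assumptions, not the paper's algorithm: the environment is a toy bandit with a softmax policy, the gradient is computed exactly, and the optimal value J* is assumed to be known. The names `softmax`, `polyak_policy_gradient`, `rewards`, and `j_star` are hypothetical and chosen only for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def polyak_policy_gradient(rewards, steps=30, eps=1e-300):
    """Exact gradient ascent on J(theta) = sum_a pi_theta(a) * r(a).

    Uses a Polyak-style step size; no learning rate is tuned by hand.
    Returns the final action probabilities and the value before each update.
    """
    theta = [0.0] * len(rewards)   # one logit per action, uniform start
    j_star = max(rewards)          # ASSUMPTION: the optimal value is known
    values = []
    for _ in range(steps):
        pi = softmax(theta)
        j = sum(p * r for p, r in zip(pi, rewards))
        values.append(j)
        # Exact softmax policy gradient for a bandit: dJ/dtheta_a = pi_a * (r_a - J)
        grad = [p * (r - j) for p, r in zip(pi, rewards)]
        grad_sq = sum(g * g for g in grad)
        if grad_sq < eps:          # gradient vanished; avoid dividing by zero
            break
        step = (j_star - j) / grad_sq   # Polyak step: gap / squared gradient norm
        theta = [t + step * g for t, g in zip(theta, grad)]
    return softmax(theta), values


probs, values = polyak_policy_gradient([1.0, 0.5, 0.0])
print("final policy:", [round(p, 4) for p in probs])
print("final value:", values[-1])
```

The step grows automatically where the gradient is small relative to the remaining gap, which is the property that removes manual tuning in this toy setting. In a realistic reinforcement learning problem, J* is generally unknown and the gradient is a noisy estimate, so both would have to be approximated; the sketch deliberately sidesteps those issues.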