Testing Stationarity and Change Point Detection in Reinforcement Learning
The authors develop a model-free test to assess the stationarity of the optimal Q-function based on historical data, enabling policy optimization in nonstationary environments.
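As a rough illustration of what testing the stationarity of an optimal Q-function can look like, the following toy sketch estimates the optimal Q-function separately before and after each candidate split point and records the largest weighted discrepancy. It is not the authors' procedure. The tabular two-state environment, the CUSUM-style weight, the candidate grid, and the helper names `estimate_q`, `max_q_discrepancy`, and `simulate` are all assumptions made for this example, and no p-value calibration is attempted.

```python
import numpy as np

def estimate_q(s, a, r, s2, n_states, n_actions, gamma=0.5, n_iter=200):
    """Tabular Q-iteration on the empirical model of one data segment."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    np.add.at(counts, (s, a, s2), 1.0)      # transition counts per (s, a, s')
    np.add.at(reward_sum, (s, a), r)        # summed rewards per (s, a)
    visits = np.maximum(counts.sum(axis=2), 1.0)  # avoid division by zero
    p_hat = counts / visits[:, :, None]     # empirical transition probabilities
    r_hat = reward_sum / visits             # empirical mean rewards
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iter):
        q = r_hat + gamma * p_hat @ q.max(axis=1)  # Bellman optimality update
    return q

def max_q_discrepancy(s, a, r, s2, n_states, n_actions, candidates):
    """Largest weighted gap between Q estimated before and after each split."""
    T = len(s)
    best = 0.0
    for u in candidates:
        q_before = estimate_q(s[:u], a[:u], r[:u], s2[:u], n_states, n_actions)
        q_after = estimate_q(s[u:], a[u:], r[u:], s2[u:], n_states, n_actions)
        weight = np.sqrt(u * (T - u)) / T   # assumed CUSUM-style weight
        best = max(best, weight * np.abs(q_before - q_after).max())
    return best

def simulate(T, change_at, rng):
    """Toy data: reward 1 when the action matches the state, flipped after change_at."""
    s = rng.integers(0, 2, size=T)
    a = rng.integers(0, 2, size=T)
    s2 = rng.integers(0, 2, size=T)         # next state is uniform, independent of (s, a)
    match = a == s
    r = np.where(np.arange(T) < change_at, match, ~match).astype(float)
    return s, a, r, s2

rng = np.random.default_rng(0)
candidates = range(100, 301, 50)            # hypothetical grid of split points
stat_stationary = max_q_discrepancy(*simulate(400, 400, rng), 2, 2, candidates)
stat_shift = max_q_discrepancy(*simulate(400, 200, rng), 2, 2, candidates)
print(f"stationary: {stat_stationary:.3g}, reward change at t=200: {stat_shift:.3g}")
```

In this toy setup the statistic stays near zero when the reward rule never changes and becomes clearly positive when it flips halfway through. A usable test would also need a reference distribution for the statistic under stationarity, which this sketch leaves out.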