Key Concepts
New approximation and reinforcement learning results for continuous state and action MDPs under average cost criteria.
Abstract
The paper presents discretization-based approximation methods, synchronous and asynchronous Q-learning algorithms, convergence analyses, and near-optimality results for Markov decision processes (MDPs) with continuous state and action spaces under the average cost criterion. It establishes convergence of the learned Q-values to the optimal Q-values of the finite approximating models and discusses the implications of these findings, introducing new approaches and relaxing the continuity conditions assumed in prior work.
Introduction
Discusses approximate solutions for MDPs under the average cost criterion (recalled below).
Focuses on problems with continuous state and action spaces.
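For reference, the long-run average cost criterion in question is the standard one; this is a textbook definition rather than a quotation from the paper:

```latex
% Long-run expected average cost of a policy \pi from initial state x,
% with stage cost c, state process (X_t), and action process (U_t):
J(x,\pi) = \limsup_{T \to \infty} \frac{1}{T}\,
           E_x^{\pi}\!\Big[\sum_{t=0}^{T-1} c(X_t, U_t)\Big],
\qquad
J^*(x) = \inf_{\pi} J(x,\pi).
```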
Literature Review
Highlights various approximation techniques used in MDPs.
Discusses challenges in applying existing techniques to average cost problems with continuous spaces.
Finite Approximations
Quantization of state and action spaces to obtain finite models; a sketch follows this list.
Error bounds for approximations based on weak continuity conditions.
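A minimal sketch of the quantization step, assuming uniform one-dimensional grids and a nearest-neighbor map; the bounds and grid sizes below are illustrative choices, not the paper's specific construction:

```python
import numpy as np

def make_grid(low, high, n_bins):
    """Uniform grid of representative points over [low, high]."""
    return np.linspace(low, high, n_bins)

def nearest_index(grid, x):
    """Index of the grid point nearest to the continuous point x."""
    return int(np.argmin(np.abs(grid - x)))

# Illustrative 1-D grids; bounds and sizes are arbitrary choices.
state_grid = make_grid(-1.0, 1.0, 50)
action_grid = make_grid(-1.0, 1.0, 10)

x = 0.137                                # a continuous state
x_hat = state_grid[nearest_index(state_grid, x)]
print(x_hat)                             # its quantized representative
```

The finite model then treats each grid point as a state (or action) of a finite MDP, and the weak continuity conditions are what control the resulting approximation error.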
Quantized Q-Learning
Presents synchronous and asynchronous Q-learning algorithms.
Convergence analysis to the optimal Q-values of the finite models constructed via quantization; a sketch of the asynchronous variant follows this list.
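A minimal sketch of an asynchronous, relative-value-iteration (RVI) style Q-learning loop on a quantized model, one standard way to handle the average cost criterion. The environment dynamics, step-size schedule, exploration policy, and reference pair are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

n_s, n_a = 50, 10                 # sizes of the quantized state/action sets
Q = np.zeros((n_s, n_a))
visits = np.zeros((n_s, n_a))
ref = (0, 0)                      # fixed reference state-action pair

def env_step(s, a):
    """Hypothetical quantized environment: returns (stage cost, next state)."""
    cost = (s / (n_s - 1) - 0.5) ** 2 + (a / (n_a - 1)) ** 2
    s_next = int(np.clip(s + rng.integers(-1, 2), 0, n_s - 1))
    return cost, s_next

s = 0
for _ in range(200_000):
    a = int(rng.integers(n_a))            # exploring policy: uniform actions
    cost, s_next = env_step(s, a)
    visits[s, a] += 1
    alpha = 1.0 / visits[s, a]            # per-pair decaying step size
    # Relative (RVI-style) update: subtracting Q[ref] keeps the iterates
    # bounded, since average-cost Q-factors are defined only up to a constant.
    Q[s, a] += alpha * (cost + Q[s_next].min() - Q[ref] - Q[s, a])
    s = s_next

print(Q[ref])   # at convergence, approximates the optimal average cost
```

A synchronous variant would instead update every (state, action) pair at each iteration using independently simulated transitions; the normalization by Q[ref] plays the same role in both cases.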
Conclusions
Summarizes contributions regarding near optimality of quantized models under average cost criteria.
Statistics
For infinite-horizon problems under the average cost criterion, there are relatively few rigorous approximation results.
The paper presents discretization-based approximation methods for fully observed MDPs with continuous spaces.
Synchronous and asynchronous Q-learning algorithms are provided for continuous spaces via quantization.
Quotes
"There exist relatively few rigorous approximation and reinforcement learning results."
"Our Q-learning convergence results are new for continuous spaces."