Q-Learning for Stochastic Control under General Information Structures and Non-Markovian Environments: Convergence Theorems and Applications
The authors present convergence theorems for Q-learning in non-Markovian environments and discuss their implications for, and applications to, a range of stochastic control problems.