Core Concepts
This work proposes MF-OML (Mean-Field Occupation-Measure Learning), an online mean-field reinforcement learning algorithm for computing approximate Nash equilibria of large-population sequential symmetric games satisfying the Lasry-Lions monotonicity condition.
Abstract
The paper addresses the problem of finding Nash equilibria in large-population multi-agent games, which is challenging because the joint state and strategy spaces grow rapidly with the number of agents. To sidestep this, the authors leverage the mean-field game (MFG) framework, which simplifies the analysis by considering the limit where the number of agents approaches infinity.
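The mean-field simplification rests on a law-of-large-numbers effect: as the population grows, the empirical distribution of agent states concentrates around a deterministic mean-field distribution, so a single representative agent interacting with that distribution approximates the full game. A minimal illustrative sketch of this concentration (the three-state distribution `mu` is hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.5, 0.3, 0.2])  # hypothetical mean-field state distribution

# Empirical state distribution of N i.i.d. agents vs. the mean field
for N in [10, 100, 1000, 10000]:
    states = rng.choice(len(mu), size=N, p=mu)
    emp = np.bincount(states, minlength=len(mu)) / N
    print(f"N={N:>5}  max deviation from mean field: {np.abs(emp - mu).max():.4f}")
```

The deviation shrinks at roughly an O(1/sqrt(N)) rate, which is the source of the mean-field approximation gap referred to later in the regret bounds.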
The key contributions are:
The authors transform the problem of finding a Nash equilibrium into one of identifying the corresponding occupation measure, which facilitates the use of optimization tools. They introduce the MF-OMI (Mean-Field Occupation-Measure Inclusion) formulation and show that it is a monotone inclusion problem under the Lasry-Lions monotonicity assumption.
They propose the MF-OMI-FBS algorithm, which solves the MF-OMI problem using forward-backward splitting, and establish convergence guarantees.
Building on MF-OMI-FBS, the authors develop the MF-OML algorithm for the online reinforcement learning setting, where the model is unknown. MF-OML achieves high-probability regret bounds for computing approximate Nash equilibria, with the bounds depending on the number of episodes and the number of agents.
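Forward-backward splitting solves a monotone inclusion 0 ∈ A(x) + B(x) by alternating an explicit (forward) step on B with a resolvent (backward) step on A. The following is a toy numerical sketch of that idea, not the paper's MF-OMI-FBS itself: here A is the normal cone of the probability simplex (whose resolvent is Euclidean projection onto the simplex), B(x) = Mx + q is an affine monotone operator, and the matrix M, vector q, and step size are illustrative choices.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex
    (the resolvent of the simplex's normal-cone operator)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

rng = np.random.default_rng(0)
n = 5
C = rng.standard_normal((n, n))
M = C.T @ C + np.eye(n)   # positive definite => B(x) = Mx + q is monotone
q = rng.standard_normal(n)
tau = 1.0 / np.linalg.eigvalsh(M).max()  # step size within the stable range

# Forward-backward splitting: forward step on B, backward step on A
x = np.ones(n) / n
for _ in range(2000):
    x = project_simplex(x - tau * (M @ x + q))

residual = M @ x + q
# At the fixed point, x solves the variational inequality
# (y - x)^T (Mx + q) >= 0 for all y in the simplex.
```

In the paper's setting the iterate plays the role of an occupation measure rather than a point on a small simplex, and the monotonicity of the operator is supplied by the Lasry-Lions condition rather than assumed on a matrix.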
The paper provides the first fully polynomial multi-agent reinforcement learning algorithm that provably computes Nash equilibria (up to mean-field approximation gaps) beyond variants of zero-sum and potential games.