Core Concepts
Joint Policy-Space Response Oracles (JPSRO) is a multi-agent training algorithm that converges to a correlated equilibrium (CE) or coarse correlated equilibrium (CCE) in n-player, general-sum games by using CE and CCE meta-solvers.
Abstract
The paper proposes a novel multi-agent training algorithm called Joint Policy-Space Response Oracles (JPSRO) that can efficiently train agents in n-player, general-sum games. The key insights are:
Correlated equilibrium (CE) and coarse correlated equilibrium (CCE) are suitable solution concepts for n-player, general-sum games: they provide a mechanism for players to coordinate their actions and can achieve higher payoffs than under a Nash equilibrium (NE).
The authors introduce a novel solution concept called Maximum Gini (Coarse) Correlated Equilibrium (MG(C)CE) that is computationally tractable, provides a unique solution, and has favorable scaling properties when the solution is full-support.
JPSRO is an iterative algorithm that trains a set of policies for each player and converges to a normal form (C)CE. It uses a (C)CE meta-solver to determine the joint policy distribution at each iteration.
The authors prove that JPSRO(CCE) converges to a CCE and JPSRO(CE) converges to a CE under their respective best response operators.
Empirical results on a range of games, including pure-competition, pure-cooperation, and general-sum settings, demonstrate the effectiveness of (C)CE meta-solvers in JPSRO compared with alternatives such as uniform, α-Rank, and projected replicator dynamics.
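The MG(C)CE idea above can be sketched as a small optimization problem: maximize the Gini impurity 1 − Σ_a σ(a)² over joint distributions σ, subject to the linear (C)CE incentive constraints. Below is a minimal illustration on a Chicken-like game; the payoff matrix, function names, and the use of SciPy's SLSQP solver are my assumptions for this sketch, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Chicken-like game, actions (Dare, Chicken). Payoffs are illustrative.
U1 = np.array([[0.0, 7.0], [2.0, 6.0]])  # row player
U2 = U1.T                                 # symmetric game: column player

def cce_gains(sigma):
    """Expected gain of each unconditional deviation; a CCE needs all <= 0."""
    s = sigma.reshape(2, 2)
    gains = []
    for a1p in range(2):  # player 1 deviates to row a1p
        gains.append(np.sum(s * (U1[a1p, :][None, :] - U1)))
    for a2p in range(2):  # player 2 deviates to column a2p
        gains.append(np.sum(s * (U2[:, a2p][:, None] - U2)))
    return np.array(gains)

# Maximum Gini: maximize 1 - sum(sigma^2)  <=>  minimize sum(sigma^2),
# subject to sigma being a distribution satisfying the CCE constraints.
cons = [{"type": "eq", "fun": lambda s: s.sum() - 1.0},
        {"type": "ineq", "fun": lambda s: -cce_gains(s)}]  # gains <= 0
res = minimize(lambda s: np.sum(s ** 2), x0=np.full(4, 0.25),
               bounds=[(0.0, 1.0)] * 4, constraints=cons, method="SLSQP")
mgcce = res.x.reshape(2, 2)  # joint distribution over action profiles
```

Because the objective is strictly convex and the constraints are linear, the maximizer is unique, which is the uniqueness property claimed for MG(C)CE.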
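The iterative structure of JPSRO described above can be sketched in a few lines. The toy below treats the pure actions of a small normal-form game as "policies", uses exact argmax best responses, and substitutes a uniform joint distribution for the (C)CE meta-solver; all of these are simplifications for illustration, and the payoffs and names are mine, not the paper's.

```python
import numpy as np

# Toy 2-player game; "policies" are pure actions, so best responses are exact.
U1 = np.array([[0.0, 7.0], [2.0, 6.0]])  # row player (Chicken-like, illustrative)
U2 = U1.T

def jpsro(max_iters=10):
    pols = [{0}, {0}]  # each player starts with a single policy
    for _ in range(max_iters):
        rows, cols = sorted(pols[0]), sorted(pols[1])
        # Meta-solver stand-in: uniform joint distribution over the restricted
        # meta-game (the paper uses a (C)CE meta-solver at this step).
        sigma = np.full((len(rows), len(cols)), 1.0 / (len(rows) * len(cols)))
        # CCE-style best responses against the co-player marginals of sigma.
        br1 = int(np.argmax(U1[:, cols] @ sigma.sum(axis=0)))
        br2 = int(np.argmax(sigma.sum(axis=1) @ U2[rows, :]))
        grew = False
        if br1 not in pols[0]:
            pols[0].add(br1); grew = True
        if br2 not in pols[1]:
            pols[1].add(br2); grew = True
        if not grew:
            break  # policy sets closed under best response
    return pols
```

The loop terminates when no player's best response adds a new policy; at that point the meta-solver's joint distribution over the final policy sets is the returned normal-form (C)CE.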
Stats
This summary reproduces no specific numerical results; it focuses on the theoretical properties of the proposed algorithm and solution concepts (the paper itself reports empirical comparisons, as noted above).
Quotes
"CEs provide a richer set of solutions than NEs. The maximum sum of social welfare in CEs is at least that of any NE."
"MG(C)CE provides a unique solution to the equilibrium solution problem and always exists."
"JPSRO(CCE) converges to a CCE and JPSRO(CE) converges to a CE under their respective best response operators."