Key Concepts
RENES leverages reinforcement learning to train a single policy that can modify games of different sizes, and then applies existing solvers to the modified games to obtain better approximations of Nash equilibrium.
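Stated compactly (a paraphrase in standard notation, not the paper's exact formulation): with $G$ the original game, $f_\theta$ the learned modification policy, $\mathcal{O}$ a fixed off-the-shelf solver, and $\mathrm{NashConv}_G$ the approximation quality measured on the original game, RENES seeks

$$\min_\theta \; \mathbb{E}_{G}\Big[\mathrm{NashConv}_{G}\big(\mathcal{O}(f_\theta(G))\big)\Big],$$

i.e., the solver runs on the modified game $f_\theta(G)$, but the returned strategy profile is always scored on $G$ itself.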
Abstract
The paper proposes a novel method called REinforcement Nash Equilibrium Solver (RENES) to improve the approximation of Nash Equilibrium (NE) in multi-player general-sum games. The key ideas are:
- Representing the games as α-rank response graphs to handle games of different sizes, and using graph neural networks to process these graphs.
- Using tensor decomposition to make the action space of the modification policy fixed-size, regardless of the game size.
- Training the modification policy with proximal policy optimization (PPO): the policy modifies the original game, an existing solver (e.g., α-rank, correlated equilibrium (CE), fictitious play (FP), or projected replicator dynamics (PRD)) is run on the modified game, and the resulting solution is evaluated on the original game; a minimal sketch of this loop is given after the list.
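The snippet below is a self-contained sketch of one such modification-and-evaluation step on a 2-player (bimatrix) game. It is illustrative only, not the paper's code: SVD stands in for the tensor decomposition, a few rounds of fictitious play stand in for the solver, and all function names (`modify_game`, `fictitious_play`, `nash_conv`) are hypothetical.

```python
import numpy as np

def modify_game(payoff, weights):
    """Rescale the top-k SVD components of a payoff matrix by `weights`.
    The action (`weights`) has a fixed size regardless of the game size."""
    u, s, vt = np.linalg.svd(payoff, full_matrices=False)
    k = len(weights)
    s_mod = s.copy()
    s_mod[:k] = s[:k] * (1.0 + weights)
    return (u * s_mod) @ vt

def fictitious_play(a, b, iters=500):
    """Toy solver stand-in: classical fictitious play on a bimatrix game."""
    x_counts, y_counts = np.ones(a.shape[0]), np.ones(a.shape[1])
    for _ in range(iters):
        x, y = x_counts / x_counts.sum(), y_counts / y_counts.sum()
        x_counts[np.argmax(a @ y)] += 1   # row player's best response
        y_counts[np.argmax(x @ b)] += 1   # column player's best response
    return x_counts / x_counts.sum(), y_counts / y_counts.sum()

def nash_conv(a, b, x, y):
    """Sum of both players' best-response gains; zero at an exact NE."""
    gain_row = np.max(a @ y) - x @ a @ y
    gain_col = np.max(x @ b) - x @ b @ y
    return gain_row + gain_col

rng = np.random.default_rng(0)
A, B = rng.normal(size=(5, 5)), rng.normal(size=(5, 5))  # original game
action = rng.normal(scale=0.1, size=3)                   # fixed-size policy output
x, y = fictitious_play(modify_game(A, action), modify_game(B, action))
reward = -nash_conv(A, B, x, y)  # solution is scored on the ORIGINAL game
```

In the full method, this reward (negative NashConv on the original game) is the signal PPO would use to update the modification policy across many games and modification steps.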
Extensive experiments on large-scale normal-form games show that RENES can significantly improve the approximation performance of various existing solvers, and the trained modification policy can generalize to unseen games. This is the first work that leverages reinforcement learning to train a single policy for modifying games to boost the performance of different NE solvers.
Statistics
The paper reports the following key metrics:
- NashConv values of the solutions obtained by applying different solvers (α-rank, CE, FP, PRD) to the original games.
- Relative improvement in NashConv after using RENES to modify the games, compared to running the same solvers on the unmodified games.
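For reference, NashConv (the standard definition from the literature, not quoted from the paper) sums, over all players, how much each player could gain by unilaterally best-responding to the others; it is zero exactly at a Nash equilibrium:

$$\mathrm{NashConv}(\pi) \;=\; \sum_{i} \Big( \max_{\pi_i'} u_i(\pi_i', \pi_{-i}) - u_i(\pi_i, \pi_{-i}) \Big)$$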
Quotes
"RENES can significantly boost the performance of α-rank, i.e., larger than 0.3 over all three seeds, as shown in Figure 4a, and achieve 0.313 and 0.324 on training and testing sets, respectively."
"For the other solvers, RENES can bring notable improvements, i.e., larger than 0.16 for CE and 0.1 for FP over three seeds."