Efficient No-regret Learning Algorithms for Convergence to Nash Equilibrium in Potential and Markov Potential Games
The authors propose a variant of the Frank-Wolfe algorithm with sufficient exploration and recursive gradient estimation that provably converges to a Nash equilibrium while attaining sublinear regret for each individual player, in both potential games and Markov potential games.
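To make the Frank-Wolfe-with-exploration idea concrete, the following is a minimal sketch on a two-player identical-interest matrix game, which is a simple potential game with potential Phi(x, y) = x^T A y. It is not the authors' algorithm: it assumes exact gradients rather than recursive gradient estimation, and the function name, the step-size schedule 2/(t+2), and the `explore` mixing weight are hypothetical choices made for illustration only.

```python
import numpy as np


def frank_wolfe_potential_game(A, steps=2000, explore=0.05):
    """Illustrative Frank-Wolfe ascent on Phi(x, y) = x^T A y.

    Assumptions (not from the source): exact gradients, a 2/(t+2)
    step size, and a fixed uniform exploration weight `explore`.
    """
    n, m = A.shape
    x = np.full(n, 1.0 / n)  # player 1 mixed strategy, starts uniform
    y = np.full(m, 1.0 / m)  # player 2 mixed strategy, starts uniform

    for t in range(1, steps + 1):
        eta = 2.0 / (t + 2)

        # Each player's gradient of the shared potential.
        grad_x = A @ y
        grad_y = A.T @ x

        # Linear maximization over the simplex picks a vertex (pure action).
        target_x = np.zeros(n)
        target_x[np.argmax(grad_x)] = 1.0
        target_y = np.zeros(m)
        target_y[np.argmax(grad_y)] = 1.0

        # Exploration: mix the vertex with the uniform distribution so
        # every action keeps probability at least explore / n.
        target_x = (1.0 - explore) * target_x + explore / n
        target_y = (1.0 - explore) * target_y + explore / m

        # Frank-Wolfe convex-combination update keeps iterates feasible.
        x = (1.0 - eta) * x + eta * target_x
        y = (1.0 - eta) * y + eta * target_y

    return x, y


if __name__ == "__main__":
    A = np.array([[2.0, 0.0], [0.0, 1.0]])  # both players receive A[i, j]
    x, y = frank_wolfe_potential_game(A)
    print("player 1 strategy:", x)
    print("player 2 strategy:", y)
```

On this example both strategies concentrate on the first action, the pure Nash equilibrium with the higher potential, while the exploration term keeps a small probability on the other action. Because each update is a convex combination of points in the simplex, no projection step is needed.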