Faster Rates for No-Regret Learning in General Games via Cautious Optimism
Abstract: We establish the first uncoupled learning algorithm that attains $O(n \log^2 d \log T)$ per-player regret in multi-player general-sum games, where $n$ is the number of players, $d$ is the number of actions available to each player, and $T$ is the number of repetitions of the game. Our results exponentially improve the dependence on $d$ compared to the $O(n\, d \log T)$ regret attainable by Log-Regularized Lifted Optimistic FTRL [Far+22c], and also reduce the dependence on the number of iterations $T$ from $\log^4 T$ to $\log T$ compared to Optimistic Hedge, the previously well-studied algorithm with $O(n \log d \log^4 T)$ regret [DFG21]. Our algorithm is obtained by combining the classic Optimistic Multiplicative Weights Update (OMWU) with an adaptive, non-monotonic learning rate that paces the learning process of the players, making them more cautious when their regret becomes too negative.
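To make the abstract's description concrete, below is a minimal sketch of an OMWU step paired with a regret-triggered step-size adjustment. The `omwu_step` function is the standard optimistic update (the latest loss is counted twice, once as the new observation and once as the forecast of the next loss); the `cautious_eta` rule is a hypothetical placeholder illustrating the "more cautious when regret becomes too negative" idea, not the paper's actual schedule, and the `threshold` parameter is an assumption introduced here for illustration.

```python
import numpy as np


def omwu_step(x, eta, loss_curr, loss_prev):
    """One Optimistic Multiplicative Weights Update (OMWU) step.

    The most recent loss vector is counted twice: once as the new
    observation and once as the optimistic prediction of the next loss.
    """
    logits = np.log(x) - eta * (2.0 * loss_curr - loss_prev)
    logits -= logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()


def cautious_eta(base_eta, cumulative_regret, threshold):
    """Hypothetical pacing rule (NOT the paper's exact schedule):
    halve the step size when cumulative regret drops far below zero,
    making the player more cautious."""
    if cumulative_regret < -threshold:
        return base_eta / 2.0
    return base_eta


# Example: one player with d actions, starting from the uniform strategy.
d = 4
x = np.full(d, 1.0 / d)
eta, regret, threshold = 0.1, 0.0, 5.0
loss_prev = np.zeros(d)
for t in range(100):
    loss_curr = np.random.rand(d)  # stand-in for the observed loss vector
    eta = cautious_eta(0.1, regret, threshold)
    x = omwu_step(x, eta, loss_curr, loss_prev)
    regret += x @ loss_curr - loss_curr.min()  # regret vs. best fixed action
    loss_prev = loss_curr
```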