The paper proposes RiskQ, a value factorization approach for risk-sensitive multi-agent reinforcement learning that satisfies the RIGM principle for common risk metrics.
The core message of this paper is that a naive definition of regret in risk-sensitive multi-agent reinforcement learning can lead to equilibrium bias, in which the most risk-sensitive agents are favored at the expense of the others. To address this issue, the authors propose a new notion of risk-balanced regret.