Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
ComaDICE is an offline cooperative multi-agent reinforcement learning algorithm. It incorporates a stationary distribution shift regularizer to address distribution shift in the offline setting, and it employs a carefully designed value decomposition strategy to facilitate multi-agent training.
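
As a rough illustration of what a stationary distribution shift regularizer typically looks like, the sketch below writes the generic objective used by DICE-style offline methods. This form is an assumption: the summary above does not state the exact objective. The symbols are introduced here for illustration only: $d^{\pi}$ is the stationary state-action distribution of the joint policy $\pi$, $d^{\mu}$ is that of the behavior policy that generated the dataset, $D_f$ is an f-divergence, and $\alpha > 0$ is a regularization weight.

```latex
% Generic DICE-style regularized objective (assumed form, not taken from the summary):
% maximize expected reward under d^pi while penalizing deviation from the data distribution d^mu.
\max_{\pi} \;
  \mathbb{E}_{(s,\mathbf{a}) \sim d^{\pi}} \big[ r(s,\mathbf{a}) \big]
  \;-\; \alpha \, D_f\!\left( d^{\pi} \,\|\, d^{\mu} \right),
\qquad
D_f\!\left( d^{\pi} \,\|\, d^{\mu} \right)
  = \mathbb{E}_{(s,\mathbf{a}) \sim d^{\mu}}
    \left[ f\!\left( \frac{d^{\pi}(s,\mathbf{a})}{d^{\mu}(s,\mathbf{a})} \right) \right].
```

Here $\mathbf{a}$ denotes the joint action of all agents. A larger $\alpha$ keeps the learned policy's stationary distribution closer to the data, while a smaller $\alpha$ puts more weight on reward.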