Reinforcement Learning with Bottleneck and Other Non-Cumulative Objectives
This paper proposes a modification to existing reinforcement learning algorithms that lets them optimize non-cumulative objectives, such as the bottleneck reward, the maximum reward, and the harmonic mean reward, instead of the usual cumulative sum of rewards. Such objectives are prevalent in application domains such as communications and networking.
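The summary does not spell out the modification itself, so the following is only an illustrative sketch under stated assumptions, not the paper's algorithm. It assumes the bottleneck objective is the minimum reward along a trajectory, no discounting, and a deterministic toy routing graph. The graph `EDGES`, the helper names `q_iteration` and `greedy_action`, and the `combine` parameter are all hypothetical. The sketch swaps the sum in a tabular Bellman-style backup for `min` and shows that the greedy action at the start state changes.

```python
# Hypothetical toy graph (not from the paper):
# EDGES[state][action] = (next_state, reward), e.g. link capacities.
EDGES = {
    0: {"a": (1, 6.0), "b": (2, 3.0)},
    1: {"a": (3, 2.0)},
    2: {"a": (3, 4.0)},
}
GOAL = 3  # terminal state


def q_iteration(combine, sweeps=10):
    """Tabular backup Q(s, a) <- combine(r, max_b Q(s', b)).

    combine = addition gives the usual cumulative objective;
    combine = min is an assumed bottleneck variant.
    """
    q = {(s, a): 0.0 for s, acts in EDGES.items() for a in acts}
    for _ in range(sweeps):
        for s, acts in EDGES.items():
            for a, (nxt, r) in acts.items():
                if nxt == GOAL:
                    # Last step of the episode: the value is the reward itself.
                    q[(s, a)] = r
                else:
                    best_next = max(q[(nxt, b)] for b in EDGES[nxt])
                    q[(s, a)] = combine(r, best_next)
    return q


def greedy_action(q, state):
    """Pick the action with the largest Q-value in the given state."""
    return max(EDGES[state], key=lambda a: q[(state, a)])


q_sum = q_iteration(lambda r, v: r + v)  # cumulative objective
q_min = q_iteration(min)                 # bottleneck objective (assumed: min)

print(greedy_action(q_sum, 0))  # route through state 1 (total 6 + 2)
print(greedy_action(q_min, 0))  # route through state 2 (bottleneck min(3, 4))
```

In this toy graph the cumulative backup favors the route with the larger total reward, while the `min` backup favors the route whose weakest link is strongest. The maximum-reward and harmonic-mean objectives named above are not covered by this sketch.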