Developing Pareto-optimal estimation and policy learning to identify the most effective treatment that maximizes total reward from short-term and long-term effects.
This paper focuses on developing Pareto-optimal estimation and policy learning to identify the most effective treatment that maximizes total reward from both short-term and long-term effects. The authors introduce a novel algorithm to tackle the challenges of balancing short-term and long-term outcomes in treatment optimization.