Ensuring a Unique Solution to the Bellman Equation with Two-Discount-Factor Surrogate Rewards for LTL Objectives
The Bellman equation for the two-discount-factor surrogate reward used for LTL objectives may have multiple solutions when one of the discount factors is set to 1, so solving it can yield an inaccurate policy evaluation. A sufficient condition for the Bellman equation to have a unique solution, equal to the value function, is to fix the solution at 0 for all states in rejecting bottom strongly connected components (BSCCs).
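
The non-uniqueness can be seen on a very small example. The sketch below is illustrative only and rests on assumptions not stated above: a hypothetical three-state Markov chain under a fixed policy, and a surrogate reward that pays 1 - gamma_B with discount gamma_B in accepting states and pays 0 with discount gamma elsewhere. The names P, gamma_B, gamma, accepting, and rejecting_bscc are chosen here for illustration. With gamma = 1, the linear system behind the Bellman equation is singular; fixing the rejecting BSCC state to 0 leaves a reduced system with a single solution.

```python
import numpy as np

# Hypothetical 3-state Markov chain under a fixed policy (an assumption):
#   state 0: transient, non-accepting
#   state 1: absorbing, accepting      (accepting BSCC)
#   state 2: absorbing, non-accepting  (rejecting BSCC)
P = np.array([[0.0, 0.5, 0.5],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
accepting = np.array([False, True, False])
rejecting_bscc = np.array([False, False, True])

# Assumed form of the two-discount-factor surrogate reward:
# reward 1 - gamma_B and discount gamma_B in accepting states,
# reward 0 and discount gamma elsewhere. Here gamma is set to 1.
gamma_B, gamma = 0.9, 1.0
R = np.where(accepting, 1.0 - gamma_B, 0.0)
Gamma = np.where(accepting, gamma_B, gamma)

# Bellman equation V = R + diag(Gamma) P V, written as A V = R.
A = np.eye(3) - Gamma[:, None] * P
rank = np.linalg.matrix_rank(A)  # less than 3 means A is singular


def is_solution(V):
    """Check whether V satisfies the Bellman equation A V = R."""
    return np.allclose(A @ V, R)


# Two different vectors that both satisfy the Bellman equation.
V_true = np.array([0.5, 1.0, 0.0])      # state 2 evaluated as 0
V_spurious = np.array([1.0, 1.0, 1.0])  # state 2 evaluated as 1

# Fix the rejecting BSCC state to 0 and solve for the remaining states.
keep = ~rejecting_bscc
V_fixed = np.zeros(3)
V_fixed[keep] = np.linalg.solve(A[np.ix_(keep, keep)], R[keep])

print("rank of A:", rank)
print("V_true solves it:", is_solution(V_true))
print("V_spurious solves it:", is_solution(V_spurious))
print("solution with rejecting BSCC fixed to 0:", V_fixed)
```

In this example the row of A for the rejecting state is all zeros, so any value for that state satisfies the equation, and the value of state 0 shifts with it. Once the rejecting state is pinned to 0, the reduced matrix is invertible and the computed value of state 0 is 0.5, which is the probability of reaching the accepting state in this chain.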