Key Concepts
The Positivity problem for linear recurrence sequences, a well-known number-theoretic problem whose decidability has been open for decades, is polynomial-time reducible to the threshold problems for the optimal values of several quantities in Markov decision processes (MDPs): termination probabilities of one-counter MDPs, satisfaction probabilities of energy objectives, conditional and partial expectations, and conditional value-at-risk for accumulated weights.
Abstract
The paper investigates a series of optimization problems for one-counter Markov decision processes (MDPs) and integer-weighted MDPs with finite state space. These problems include:
Termination probabilities and expected termination times for one-counter MDPs
Satisfaction probabilities of energy objectives
Conditional and partial expectations
Satisfaction probabilities of constraints on the total accumulated weight
Computation of quantiles for the accumulated weight
Conditional value-at-risk for accumulated weights
Although algorithmic results are available for some special instances, the decidability status of the decision versions of these problems is unknown in general.
The paper demonstrates that these optimization problems are inherently mathematically difficult by providing polynomial-time reductions from the Positivity problem for linear recurrence sequences. This problem is a well-known number-theoretic problem whose decidability status has been open for decades. The reductions rely on the construction of MDP-gadgets that encode the initial values and linear recurrence relations of linear recurrence sequences. These gadgets can be adjusted to prove the various Positivity-hardness results.
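To make the source problem concrete: in its usual formulation, the Positivity problem asks whether every term of a given linear recurrence sequence is non-negative. The sketch below is only an illustration of that question, not the paper's construction. The function names and the coefficient ordering are assumptions made here. The code searches a finite prefix for a negative term, so it can refute positivity but can never confirm it, which is exactly where the difficulty lies.

```python
from fractions import Fraction


def lrs_terms(coeffs, initial, n_terms):
    """Yield u_0, ..., u_{n_terms-1} of the linear recurrence sequence
    u_{n+k} = coeffs[0]*u_{n+k-1} + ... + coeffs[k-1]*u_n
    with initial values u_0, ..., u_{k-1} (hypothetical helper)."""
    k = len(coeffs)
    assert len(initial) == k, "need one initial value per coefficient"
    window = [Fraction(x) for x in initial]  # holds u_n, ..., u_{n+k-1}
    for _ in range(n_terms):
        yield window[0]
        # coeffs[0] multiplies the most recent term, hence reversed(window)
        nxt = sum(Fraction(c) * u for c, u in zip(coeffs, reversed(window)))
        window = window[1:] + [nxt]


def first_negative_index(coeffs, initial, n_terms):
    """Return the first n < n_terms with u_n < 0, or None if no such n
    exists in the inspected prefix. None does NOT prove positivity."""
    for n, u in enumerate(lrs_terms(coeffs, initial, n_terms)):
        if u < 0:
            return n
    return None


# u_{n+2} = u_{n+1} - u_n with u_0 = u_1 = 1 gives 1, 1, 0, -1, ...
print(first_negative_index([1, -1], [1, 1], 10))
```

Exact rational arithmetic via `Fraction` avoids sign errors from floating-point rounding when terms are close to zero.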
The key steps are three direct reductions from the Positivity problem to the threshold problems for the maximal termination probability of one-counter MDPs, the maximal partial expectation, and the maximal conditional value-at-risk. Further chains of reductions establish Positivity-hardness for the full series of optimization problems under investigation.
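As a rough illustration of the target side of the first reduction, the following sketch computes the maximal probability that a one-counter MDP terminates within a fixed number of steps. It assumes that termination means the counter reaches 0 and that each transition changes the counter by -1, 0, or +1. The toy MDP, its encoding as nested dictionaries, and the function name are hypothetical and not taken from the paper. The result is a lower bound on the maximal termination probability; the threshold problem asks whether the unbounded value meets a given rational threshold, which this bounded computation cannot settle in general.

```python
from functools import lru_cache

# Hypothetical toy one-counter MDP:
# mdp[state][action] = list of (probability, next_state, counter_change)
TOY_MDP = {
    "s": {
        "a": [(0.5, "s", -1), (0.5, "s", +1)],
        "b": [(0.3, "s", -1), (0.7, "s", 0)],
    },
}


def max_termination_within(mdp, state, counter, horizon):
    """Maximal probability, over all strategies, that the counter
    reaches 0 within `horizon` steps (assumed termination condition)."""

    @lru_cache(maxsize=None)
    def value(s, c, h):
        if c <= 0:
            return 1.0  # already terminated
        if h == 0:
            return 0.0  # step budget exhausted
        # Bellman step: pick the action with the best expected value
        return max(
            sum(p * value(t, c + d, h - 1) for p, t, d in outcomes)
            for outcomes in mdp[s].values()
        )

    return value(state, counter, horizon)


# Lower bounds that grow with the horizon
for h in (1, 2, 50):
    print(h, max_termination_within(TOY_MDP, "s", 1, h))
```

The counter is unbounded, so the set of reachable configurations grows with the horizon; memoization keeps the bounded computation polynomial in the horizon for this sketch.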