The paper studies discrete-time finite-horizon optimal control problems in probability spaces, where the state of the system is a probability measure. The authors show that the solution of such problems can be obtained by combining two ingredients: (1) the solution of dynamic programming in the "ground space" (i.e., the space on which the probability measures live) and (2) the solution of an optimal transport problem.
The key insights are:
(1) The cost-to-go in the probability space is equal to the multi-marginal optimal transport discrepancy, with the transportation cost being the cost-to-go in the ground space.
(2) The optimal state-input distribution results from the combination of the optimal input in the ground space and the solution of the optimal transport problem.
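The two insights above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a toy setting with a finite one-dimensional ground space, deterministic dynamics x + u, stage cost u^2, a hard terminal constraint, and a fleet of equally weighted agents, so that the optimal transport problem reduces to a two-marginal assignment problem (solved here by brute force). The names `ground_cost_to_go` and `fleet_cost_to_go` are hypothetical.

```python
from itertools import permutations

INF = float("inf")

def ground_cost_to_go(states, inputs, horizon, target):
    """Backward dynamic programming in the ground space.

    Returns a dict mapping each state x to the minimum cost of reaching
    `target` in exactly `horizon` steps (assumed dynamics: x_next = x + u,
    assumed stage cost: u**2).
    """
    # Terminal cost: zero at the target, infinite elsewhere.
    J = {x: (0.0 if x == target else INF) for x in states}
    for _ in range(horizon):
        J_next, J = J, {}
        for x in states:
            # Bellman recursion over all inputs that stay in the state space.
            J[x] = min(
                (u * u + J_next[x + u] for u in inputs if x + u in J_next),
                default=INF,
            )
    return J

def fleet_cost_to_go(starts, targets, states, inputs, horizon):
    """Fleet-level problem: who goes where?

    The transport cost between a start and a target is the ground-space
    cost-to-go; with equally weighted agents, optimal transport reduces
    to an assignment, enumerated here by brute force.
    """
    J = {y: ground_cost_to_go(states, inputs, horizon, y) for y in set(targets)}
    best_total, best_perm = INF, None
    for perm in permutations(range(len(targets))):
        total = sum(J[targets[j]][x] for x, j in zip(starts, perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm

# Three agents, three destinations, horizon of three steps.
total, assignment = fleet_cost_to_go(
    starts=[0, 1, 5],
    targets=[2, 6, 4],
    states=range(8),
    inputs=[-2, -1, 0, 1, 2],
    horizon=3,
)
print(total, assignment)  # 6.0 (0, 2, 1)
```

In this toy instance the agent starting at 0 goes to 2, the agent at 1 goes to 4, and the agent at 5 goes to 6. The assignment is computed from the ground-space cost-to-go alone, and each agent then follows its own ground-space optimal input, which is the decoupling the summary describes. For larger fleets the brute-force enumeration would be replaced by a proper assignment or optimal transport solver.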
This separation principle reveals that the "low-level control of the agents of the fleet" (how does one reach the destination?) and the "fleet-level control" (who goes where?) are decoupled. The authors provide several examples and counterexamples to illustrate their results and expose potential pitfalls.
The proofs rely on novel stability results for the (multi-marginal) optimal transport problem, which are of independent interest.