This work derives the dynamic programming principle for continuous-time stochastic control/stopping problems via measurable-selection techniques, from which one obtains characterizations of optimal control/stopping processes and asymptotic representations of the value function.
This paper proposes a data-driven stochastic predictive control strategy that utilizes distributionally robust conditional value-at-risk constraints and optimizes the feedback gain within the control policy, providing improved safety and flexibility compared to previous approaches.
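The conditional value-at-risk (CVaR) constraint referenced above averages over the worst outcomes at a given confidence level. As a minimal illustration (not the paper's method, which additionally handles distributional ambiguity over the sample distribution), the function name and the toy data below are hypothetical:

```python
import numpy as np

def empirical_cvar(losses, alpha=0.1):
    """Empirical CVaR at level alpha: mean of the worst alpha-fraction of losses."""
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # descending: worst first
    k = max(1, int(np.ceil(alpha * len(losses))))            # number of tail samples
    return losses[:k].mean()

# Losses 1..100: the worst 10% are 91..100, so CVaR_0.1 = 95.5
print(empirical_cvar(range(1, 101), alpha=0.1))
```

A distributionally robust variant would instead bound the supremum of this quantity over an ambiguity set of distributions consistent with the observed samples.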
This paper presents a computational method for solving singular stochastic control problems motivated by queueing theory applications. The method approximates the original singular control problem by a drift control problem, which can be solved efficiently using a recently developed simulation-based approach.
The paper introduces novel classes of risk-aware fixed-time control Lyapunov functions (RA-FxT-CLFs) and risk-aware path-integral control Lyapunov functions (RA-PI-CLFs) to certify that a stochastic, nonlinear system's trajectories reach a goal set within a fixed time with a specified probability, despite the presence of measurement uncertainty.
This paper presents a convergence theorem for stochastic iterations, particularly Q-learning, under general, possibly non-Markovian, stochastic environments.
This work focuses on the theoretical derivation and solution of a newly proposed control problem, the soft-constrained Schrödinger bridge (SSB).
The authors present convergence theorems for Q-learning under non-Markovian environments, discussing implications and applications to various stochastic control problems.
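The Q-learning iteration whose convergence these papers study can be sketched in its standard tabular form. The toy MDP below is hypothetical (one non-terminal state 0, a terminal goal state 1, discount 0.9), chosen only so the fixed point is easy to verify; the cited results concern far more general, possibly non-Markovian, settings:

```python
import numpy as np

gamma, lr = 0.9, 0.5
Q = np.zeros((2, 2))              # Q[state, action]; state 1 is terminal (Q stays 0)
rng = np.random.default_rng(0)
s = 0
for _ in range(5000):
    a = int(rng.integers(2))      # uniform exploration: every state-action pair
                                  # is visited infinitely often, as convergence requires
    s_next, r = (1, 1.0) if a == 1 else (0, 0.0)
    target = r if s_next == 1 else r + gamma * Q[s_next].max()
    Q[s, a] += lr * (target - Q[s, a])
    s = 0                         # episodes restart at the single non-terminal state
```

At the fixed point of the Bellman optimality equation, Q[0, 1] = 1 (immediate reward, terminal successor) and Q[0, 0] = gamma * max_a Q[0, a] = 0.9, which the iteration approaches.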