Core Concepts
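The authors propose an end-to-end reinforcement learning (RL) method for training Koopman models to perform optimally in (economic) nonlinear model predictive control, (e)NMPC.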
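As a rough illustration of what a Koopman surrogate model looks like inside a predictive controller, the sketch below lifts a state into a latent space, advances it with linear dynamics, and decodes a predicted state. It is not the authors' implementation. The lifting function, the matrices A, B, C, and every number are hypothetical values chosen for readability. In the end-to-end setting described above, such parameters would be tuned by RL on closed-loop control performance rather than fitted to prediction error by system identification.

```python
# Minimal Koopman-style surrogate: nonlinear lifting, linear latent dynamics.
# The lifting function and all numeric values are illustrative assumptions.

def lift(x):
    """Hypothetical encoder: map a scalar state x to the latent z = [x, x^2]."""
    return [x, x * x]

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def step(z, u, A, B):
    """Linear latent update z_next = A z + B u, with a scalar control u."""
    Az = matvec(A, z)
    return [azi + bi * u for azi, bi in zip(Az, B)]

def rollout(x0, controls, A, B, C):
    """Predict the decoded state after each control in the sequence."""
    z = lift(x0)
    xs = []
    for u in controls:
        z = step(z, u, A, B)
        xs.append(sum(c * zi for c, zi in zip(C, z)))  # decoder x = C z
    return xs

A = [[0.5, 0.0], [0.0, 0.25]]  # assumed latent dynamics
B = [1.0, 0.0]                 # assumed control input map
C = [1.0, 0.0]                 # assumed linear decoder
trajectory = rollout(2.0, [0.0, 1.0], A, B, C)
print(trajectory)
```

Because the latent dynamics are linear, a controller that optimizes the control sequence over such a rollout works with a more tractable prediction model than a general nonlinear one, which is the usual motivation for Koopman models in (e)NMPC.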
Statistics
Includes the various parameter values used during system identification (SI) and RL training.
Quotes
"End-to-end training of Koopman models for optimal performance in (e)NMPC applications with hard constraints on states."
"Using RL to train dynamic surrogate models promises to combine the aforementioned advantages of model-based policies with the typical advantage of end-to-end learning over SI."
"We show that the end-to-end trained models outperform those trained using system identification in (e)NMPC."