This work presents a technique for active rejection of multiple independent, time-correlated stochastic disturbances acting on a nonlinear flexible inverted pendulum with cart (FIPWC) system with uncertain model parameters. The control law is learned through deep reinforcement learning, specifically Deep Deterministic Policy Gradient (DDPG), an actor-critic algorithm that extends deep Q-learning to continuous action spaces.
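To make the learning machinery concrete, the following is a minimal sketch of the DDPG actor-critic update in PyTorch. The network sizes, learning rates, discount factor, and state/action dimensions are illustrative assumptions, not values from the paper; the sketch only shows the algorithm's core (target networks, critic regression, deterministic policy gradient, and soft updates).

```python
# Minimal DDPG core, assuming PyTorch. Hyperparameters and the
# STATE_DIM/ACTION_DIM sizes below are hypothetical, not the paper's.
import copy
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 6, 1   # hypothetical FIPWC state/action sizes
GAMMA, TAU = 0.99, 0.005       # discount factor and soft-update rate

class Actor(nn.Module):
    """Deterministic policy: maps a state to a bounded continuous action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh())

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Q-function: scores a (state, action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    """One gradient step on a minibatch (s, a, r, s2, done) of transitions."""
    with torch.no_grad():
        # Bootstrapped target uses the *target* networks for stability.
        q_target = r + GAMMA * (1 - done) * critic_tgt(s2, actor_tgt(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Deterministic policy gradient: push actions toward higher Q-values.
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Polyak (soft) update of both target networks.
    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1 - TAU).add_(TAU * p.data)
```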
The FIPWC system is modeled as a beam with a tip mass, with flexibility effects represented by linear springs. Parametric uncertainty enters the model through the pendulum's structural stiffness and damping coefficients, as well as the cart's friction damping coefficient. Disturbances are injected into the system through three independent Ornstein-Uhlenbeck stochastic processes, which produce time-correlated rather than white noise.
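The Ornstein-Uhlenbeck process is the mean-reverting stochastic differential equation $dx = \theta(\mu - x)\,dt + \sigma\,dW$. A minimal sketch of one disturbance channel via Euler-Maruyama discretization follows; the values of $\theta$, $\sigma$, $\mu$, and the time step are illustrative assumptions, not the paper's parameters.

```python
# One Ornstein-Uhlenbeck disturbance channel via Euler-Maruyama.
# Parameter values are illustrative, not taken from the paper.
import numpy as np

def ou_path(n_steps, dt=0.01, theta=0.15, sigma=0.2, mu=0.0, x0=0.0,
            seed=None):
    """Simulate dx = theta * (mu - x) dt + sigma dW on a fixed grid."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for k in range(1, n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))  # Brownian increment over dt
        x[k] = x[k - 1] + theta * (mu - x[k - 1]) * dt + sigma * dw
    return x

# Three independent channels, matching the paper's three disturbance
# processes (one of which acts on the cart velocity).
disturbances = [ou_path(1000, seed=s) for s in range(3)]
```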
The DDPG agent learns a policy that maps system states to control actions so as to keep the pendulum vertical throughout the simulation despite the stochastic disturbances. Simulation results are compared against a classical proportional-derivative (PD) controller, demonstrating the superior performance of the deep reinforcement learning approach, particularly in the presence of the cart velocity disturbance.
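For reference, the PD baseline has the standard form $u = K_p e + K_d \dot{e}$. A minimal sketch follows; the gains and the choice of pendulum-angle error as the feedback signal are assumptions for illustration, not the paper's tuned controller.

```python
# Classical PD baseline. Gains are illustrative, not the paper's values.
def pd_control(theta_err, theta_err_rate, kp=50.0, kd=10.0):
    """PD law u = kp * e + kd * de/dt, acting as a force on the cart."""
    return kp * theta_err + kd * theta_err_rate
```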