The paper explores using reinforcement learning (RL) to tune the control gains of a quadcopter controller, comparing adaptive gain scheduling against static gain control. By applying Proximal Policy Optimization (PPO), the study achieves a significant reduction in tracking error.

The research reviews quadcopter dynamics, emphasizing that the vehicle's inherent instability demands fast controller responses. Training takes place in a simulated environment, so no real drones are put at risk. The study details the Markov Decision Process (MDP) formulation, including the agent, state space, action space, transitions, and reward function. PPO is presented as a sample-efficient policy-gradient method that optimizes the policy through clipped gradient updates, and the training setup is described with the specific steps and hyperparameters used.

Results show improved tracking performance for RL-tuned trajectories compared to a traditional fixed-gain controller, with substantial percentage reductions in Integral Squared Error (ISE) and Integral Time Squared Error (ITSE). Suggested future work includes extending the results to a full 6-degree-of-freedom quadcopter model and testing on real drones.
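The paper's own implementation is not reproduced here. As a minimal sketch of how such an MDP could be set up, the snippet below models a toy 1-D altitude-tracking task in which the agent's action scales baseline PD gains and the reward penalizes squared tracking error; it assumes gymnasium and stable-baselines3 as tooling, and all names (QuadGainEnv, kp_base, the dynamics, and the gain ranges) are illustrative rather than taken from the paper.

```python
# Illustrative sketch, not the paper's implementation: a 1-D altitude-tracking
# MDP where the agent picks multiplicative adjustments to baseline PD gains.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class QuadGainEnv(gym.Env):
    """Toy altitude model: action scales (kp, kd); reward is -error^2."""

    def __init__(self, dt=0.02, horizon=500):
        super().__init__()
        self.dt, self.horizon = dt, horizon
        self.kp_base, self.kd_base = 8.0, 4.0          # baseline PD gains (made up)
        # observation: [altitude error, vertical velocity]
        self.observation_space = spaces.Box(-np.inf, np.inf, (2,), np.float32)
        # action: gain multipliers for (kp, kd) in [0.5, 1.5]
        self.action_space = spaces.Box(0.5, 1.5, (2,), np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.z, self.vz, self.t = 0.0, 0.0, 0
        self.z_ref = 1.0                                # step reference in altitude
        return self._obs(), {}

    def _obs(self):
        return np.array([self.z_ref - self.z, self.vz], dtype=np.float32)

    def step(self, action):
        kp = self.kp_base * float(action[0])            # agent-scheduled gains
        kd = self.kd_base * float(action[1])
        err = self.z_ref - self.z
        accel = kp * err - kd * self.vz                 # PD control law (gravity-compensated)
        self.vz += accel * self.dt                      # simplified double-integrator dynamics
        self.z += self.vz * self.dt
        self.t += 1
        reward = -err ** 2                              # penalize squared tracking error
        truncated = self.t >= self.horizon
        return self._obs(), reward, False, truncated, {}

# Train a PPO policy that learns how to schedule the gains.
env = QuadGainEnv()
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=50_000)
```

A static-gain baseline corresponds to fixing the action at (1.0, 1.0); comparing its tracking error against the trained policy mirrors the paper's adaptive-versus-static comparison.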
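The two reported metrics have standard definitions, ISE = ∫ e(t)² dt and ITSE = ∫ t·e(t)² dt. The helper below shows how they could be computed from a logged error trace; the sample numbers are made up for illustration and are not the paper's results.

```python
import numpy as np

def ise_itse(error, dt):
    """Discrete approximations of ISE = sum(e_k^2)*dt and ITSE = sum(t_k*e_k^2)*dt."""
    error = np.asarray(error, dtype=float)
    t = np.arange(error.size) * dt
    ise = np.sum(error ** 2) * dt
    itse = np.sum(t * error ** 2) * dt
    return ise, itse

# Example comparison of a baseline and an RL-tuned error trace (illustrative values).
baseline_err = np.array([1.0, 0.8, 0.5, 0.3, 0.2, 0.1])
rl_err = np.array([1.0, 0.6, 0.3, 0.1, 0.05, 0.0])
for name, e in [("baseline", baseline_err), ("RL-tuned", rl_err)]:
    ise, itse = ise_itse(e, dt=0.02)
    print(f"{name}: ISE={ise:.4f}, ITSE={itse:.5f}")
```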
Key insights distilled from the source paper by Mike Timmerm... (arxiv.org, 03-13-2024): https://arxiv.org/pdf/2403.07216.pdf