The paper explores using reinforcement learning (RL) to adjust the control gains of a quadcopter controller, comparing adaptive gain scheduling against a static-gain baseline. Using Proximal Policy Optimization (PPO), the study achieves a substantial reduction in tracking error. The work reviews quadcopter dynamics, emphasizing that the vehicle's inherent instability demands fast controller responses, and it trains the agent entirely in a simulated environment so that no real drone is put at risk.

The study formulates gain tuning as a Markov Decision Process (MDP), detailing the agent, transition dynamics, action space, state space, and reward function. PPO is presented as an efficient policy-gradient method that improves the policy by gradient ascent on its objective, and the training setup is described step by step with the parameters used. Results show that RL-controlled trajectories track references more closely than the traditional fixed-gain controller, with substantial percentage improvements in Integral Squared Error (ISE) and Integral Time-Squared Error (ITSE). Suggested future work includes extending the results to the full 6-degree-of-freedom quadcopter and testing on real drones.
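To make the MDP formulation concrete, below is a minimal gymnasium-style sketch in which the agent's actions are increments to the gains of a PD loop, the observation carries the tracking error and the current gains, and the reward penalizes squared tracking error. The single-axis dynamics, gain bounds, and initial values here are illustrative assumptions, not the paper's exact model.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class GainTuningEnv(gym.Env):
    """Single-axis stand-in for the gain-scheduling MDP: the agent nudges
    the PD gains of an attitude loop; the reward penalizes squared
    tracking error. Dynamics and bounds are illustrative assumptions."""

    def __init__(self, dt=0.01, horizon=500):
        super().__init__()
        self.dt, self.horizon = dt, horizon
        # Action: increments applied to [kp, kd] each step (the gain schedule).
        self.action_space = spaces.Box(-0.1, 0.1, shape=(2,), dtype=np.float32)
        # State: tracking error, error rate, and the current gains.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(4,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.k = np.array([4.0, 0.5])   # initial [kp, kd] (assumed)
        self.x = np.zeros(2)            # [angle, angular rate]
        self.ref, self.t = 1.0, 0       # step reference to track
        return self._obs(), {}

    def _obs(self):
        return np.array([self.ref - self.x[0], -self.x[1], *self.k], dtype=np.float32)

    def step(self, action):
        self.k = np.clip(self.k + action, 0.0, 20.0)   # adapt the gains
        e = self.ref - self.x[0]
        u = self.k[0] * e - self.k[1] * self.x[1]      # PD control law
        # Euler step of a double-integrator attitude axis.
        self.x = self.x + self.dt * np.array([self.x[1], u])
        self.t += 1
        reward = -(self.ref - self.x[0]) ** 2          # penalize tracking error
        return self._obs(), float(reward), False, self.t >= self.horizon, {}
```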
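PPO's gradient-based update maximizes the clipped surrogate objective of Schulman et al. (2017):

$$L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}$$

where $\hat{A}_t$ is the advantage estimate and $\epsilon$ the clipping range. A minimal training loop for the sketch environment above, assuming the Stable-Baselines3 library (the paper does not specify which implementation or hyperparameters it used):

```python
from stable_baselines3 import PPO

env = GainTuningEnv()
model = PPO("MlpPolicy", env, learning_rate=3e-4, verbose=1)  # clipped-surrogate PPO
model.learn(total_timesteps=200_000)  # optimize the gain-scheduling policy
model.save("ppo_gain_scheduler")
```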
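The two reported metrics are standard integral performance indices: ISE integrates the squared tracking error over the trajectory, while ITSE additionally weights the squared error by time, so errors that persist late in the run are penalized more heavily. A minimal sketch of computing both from a logged error trace (the error signal here is synthetic, purely for illustration):

```python
import numpy as np

def _trapz(y, t):
    """Trapezoidal integration of samples y over the time grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def ise(t, e):
    """Integral Squared Error: integral of e(t)^2 dt."""
    return _trapz(e ** 2, t)

def itse(t, e):
    """Integral Time-Squared Error: integral of t * e(t)^2 dt."""
    return _trapz(t * e ** 2, t)

# Synthetic decaying-oscillation error trace.
t = np.linspace(0.0, 5.0, 501)
e = np.exp(-t) * np.sin(2.0 * np.pi * t)
print(f"ISE  = {ise(t, e):.4f}")
print(f"ITSE = {itse(t, e):.4f}")
```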
Key insights distilled from source content by Mike Timmerm... on arxiv.org, 03-13-2024: https://arxiv.org/pdf/2403.07216.pdf