Core Concepts
This paper presents a novel approach to drone navigation: a nonlinear Deep Reinforcement Learning (DRL) agent replaces traditional linear Proportional-Integral-Derivative (PID) controllers, enabling a seamless transition between manual and autonomous modes and improving responsiveness and stability. The integration of a high-precision Vicon indoor tracking system and a 3D A* path planner further demonstrates the potential of Artificial Intelligence in robotic control systems.
Abstract
The paper aims to revolutionize drone flight control by replacing traditional linear Proportional-Integral-Derivative (PID) controllers with a nonlinear Deep Reinforcement Learning (DRL) agent. The primary objective is to enable a seamless transition between manual and autonomous modes while enhancing responsiveness and stability.
The key highlights and insights are:
Utilization of the Proximal Policy Optimization (PPO) reinforcement learning strategy within the Gazebo simulator to train the DRL agent.
Integration of a $20,000 indoor Vicon tracking system providing sub-millimeter (<0.1 mm) positioning accuracy, significantly improving autonomous flight precision.
Development of a 3D A* path planner that navigates the drone along the shortest collision-free trajectory, successfully deployed in real flights.
Extensive testing and prototyping, including manual and autonomous flights, Vicon-integrated flights, and DRL-based control algorithms.
A 50% improvement in drone positioning accuracy, together with better average speed, overshoot, and settling time, using the DRL controller compared to the traditional PID controller.
Challenges with the DRL sim-to-real gap, which caused minor jittering and overshoot in real flights and can be addressed through further training and fine-tuning in the real environment.
Implications for future phases, including continuous improvement of control algorithms, deeper exploration of DRL applications, and enhancing the drone's autonomy and technological capabilities.
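The 3D A* planner mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a voxel occupancy grid (a set of blocked cells plus grid bounds), 6-connected motion, and a Manhattan-distance heuristic, which is admissible under those assumptions.

```python
import heapq

def astar_3d(grid, start, goal):
    """Minimal 3D A* over a voxel occupancy grid (illustrative sketch).

    grid:  (blocked, bounds) where blocked is a set of occupied (x, y, z)
           cells and bounds is the grid size per axis
    start, goal: (x, y, z) cells within bounds
    Returns the shortest collision-free path as a list of cells, or None.
    """
    blocked, bounds = grid

    def h(a, b):
        # Manhattan distance: admissible for unit-cost 6-connected moves
        return abs(a[0] - b[0]) + abs(a[1] - b[1]) + abs(a[2] - b[2])

    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    open_heap = [(h(start, goal), 0, start)]  # (f = g + h, g, cell)
    came_from = {start: None}
    g_cost = {start: 0}

    while open_heap:
        _, g, cur = heapq.heappop(open_heap)
        if cur == goal:
            # Reconstruct the path by walking parent links back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dx, dy, dz in moves:
            nxt = (cur[0] + dx, cur[1] + dy, cur[2] + dz)
            if not all(0 <= nxt[i] < bounds[i] for i in range(3)):
                continue  # outside the flight volume
            if nxt in blocked:
                continue  # occupied voxel: would collide
            ng = g + 1
            if ng < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = ng
                came_from[nxt] = cur
                heapq.heappush(open_heap, (ng + h(nxt, goal), ng, nxt))
    return None  # no collision-free path exists
```

In a real pipeline the resulting cell sequence would be converted to waypoints in the Vicon frame and fed to the flight controller.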
Stats
The PositionError is defined as the absolute distance between the target and the current location.
The Distance is defined as the absolute distance between the target and the starting position.
The AverageSpeed is calculated as Distance divided by the time the drone takes to navigate from the starting point to the target, once the drone has remained stably at the target for more than 50 timesteps.
The reward function is designed to strongly encourage precise navigation to the target and a high average speed.
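The metrics above can be sketched directly from their definitions. The weights `w_error` and `w_speed` in the reward below are illustrative assumptions, not values from the paper, which only states that precise navigation and high average speed are rewarded.

```python
import math

STABLE_TIMESTEPS = 50  # the drone must hold the target this long (per the definition above)

def position_error(target, current):
    """PositionError: absolute distance between the target and the current location."""
    return math.dist(target, current)

def average_speed(target, start, elapsed_time):
    """AverageSpeed: Distance (target to start) divided by the time to reach the target."""
    distance = math.dist(target, start)
    return distance / elapsed_time

def reward(target, current, avg_speed, w_error=1.0, w_speed=0.1):
    """Shaped reward: penalize position error, add a bonus for high average speed.

    w_error and w_speed are hypothetical weights chosen for illustration.
    """
    return -w_error * position_error(target, current) + w_speed * avg_speed
```

For example, a drone that starts 5 m from the target and reaches it in 2 s has an AverageSpeed of 2.5 m/s, and once PositionError is zero the reward reduces to the speed bonus alone.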
Quotes
"Our main goal for our senior design project has been to strategically integrate deep reinforcement learning (DRL) to transform drone navigation."
"The integration of the Vicon tracking system, a cutting-edge technology that has greatly increased the drone's location accuracy (<0.1mm), was a crucial component of our project."
"Our effort aims to demonstrate the concrete advantages of deep reinforcement learning in navigating intricate settings, going beyond the limitations of conventional control methods."