The paper addresses the multi-UAV pursuit-evasion problem, in which a group of drones cooperates to capture a fast evader in a confined environment with obstacles. Existing heuristic algorithms often lack expressive coordination strategies and struggle to capture the evader in extreme scenarios, while reinforcement learning (RL) methods are hard to train for complex 3D scenarios with diverse task settings because the exploration space is vast.
The authors propose a dual curriculum learning framework, named DualCL, to address these challenges. DualCL comprises two main components:
Intrinsic Parameter Curriculum Proposer: This module progressively suggests intrinsic parameters (capture radius and evader speed) from easy to hard to continually improve the capture capability of drones.
External Environment Generator: This component efficiently explores unresolved scenarios and generates appropriate training distributions of external environment parameters (drone/evader positions, obstacle positions and heights) to further enhance the capture performance of the policy across various scenarios.
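The interplay of the two components can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the 0.9 advancement threshold, the 0.2 to 0.8 band used to mark a scenario as "unresolved", and the archive sampling probability are all assumptions chosen for illustration. The sketch only shows the general idea of an easy-to-hard schedule over intrinsic parameters combined with an archive of scenarios the policy has not yet mastered.

```python
import random
from dataclasses import dataclass


@dataclass
class IntrinsicParams:
    capture_radius: float  # larger radius = easier capture
    evader_speed: float    # faster evader = harder capture


class IntrinsicCurriculum:
    """Hypothetical easy-to-hard schedule over intrinsic parameters."""

    def __init__(self, stages, threshold=0.9):
        self.stages = stages        # ordered from easy to hard
        self.threshold = threshold  # assumed capture rate needed to advance
        self.index = 0

    @property
    def current(self):
        return self.stages[self.index]

    def update(self, capture_rate):
        # Advance one stage once the policy is good enough on the current one.
        if capture_rate >= self.threshold and self.index < len(self.stages) - 1:
            self.index += 1
        return self.current


class EnvironmentGenerator:
    """Hypothetical generator that revisits scenarios not yet solved."""

    def __init__(self, sample_random, low=0.2, high=0.8,
                 archive_prob=0.7, rng=None):
        self.sample_random = sample_random  # callable: rng -> scenario
        self.low, self.high = low, high     # assumed "unresolved" band
        self.archive_prob = archive_prob    # assumed share of archive samples
        self.rng = rng or random.Random()
        self.archive = []

    def record(self, scenario, capture_rate):
        # Keep scenarios that are neither trivially solved nor hopeless.
        if self.low <= capture_rate <= self.high:
            self.archive.append(scenario)

    def sample(self):
        # Mix archived unresolved scenarios with fresh random ones.
        if self.archive and self.rng.random() < self.archive_prob:
            return self.rng.choice(self.archive)
        return self.sample_random(self.rng)
```

In a training loop built on this sketch, each iteration would draw a scenario from `EnvironmentGenerator.sample()`, run the policy under `IntrinsicCurriculum.current`, and then feed the measured capture rate back through `record()` and `update()`.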
Simulation experiments show that DualCL significantly outperforms baseline methods, achieving a capture rate above 90% and reducing the number of timesteps needed for capture by at least 27.5% in the training scenarios. DualCL also exhibits the best zero-shot generalization ability in unseen environments. The authors further demonstrate that the pursuit strategy transfers from simulation to real-world environments.
Source: arxiv.org