Bagajo, J., Schwarke, C., Klemm, V., Georgiev, I., Sleiman, J.-P., Tordesillas, J., Garg, A., & Hutter, M. (2024). DiffSim2Real: Deploying Quadrupedal Locomotion Policies Purely Trained in Differentiable Simulation. In CoRL 2024 Workshop 'Differentiable Optimization Everywhere'.
This research aims to show that quadrupedal locomotion policies can be trained entirely within a differentiable simulator and then transferred to a real robot, a task previously considered challenging because of the limitations of existing contact models in such simulators.
The researchers developed a differentiable simulator built around an "analytically smooth contact model" that combines the advantages of hard and soft contact models. The model provides both physical accuracy and informative gradients, which are crucial for effective policy learning and sim-to-real transfer. They trained with the Short-Horizon Actor-Critic (SHAC) algorithm, which uses the simulator's first-order gradients to improve learning efficiency. The team then adapted the learning setup, including the reward function and inertia model, so that the policy would transfer to the ANYbotics ANYmal D robot. For robustness on hardware, this adaptation also included domain randomization and a simplified actuator model.
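To illustrate why contact smoothing matters for gradient-based learning, here is a minimal sketch in plain Python. It is not the paper's analytically smooth contact model: it assumes a one-dimensional foot height, a generic penalty-style "hard" force, and a softplus-smoothed variant. All function names and the `stiffness` and `sharpness` parameters are hypothetical and chosen only for illustration.

```python
import math


def hard_contact_force(height, stiffness=1000.0):
    """Penalty-style hard contact: force acts only when penetrating (height < 0)."""
    return stiffness * max(0.0, -height)


def hard_contact_grad(height, stiffness=1000.0):
    """d(force)/d(height). Exactly zero above the ground, so a gradient-based
    learner gets no signal about an upcoming touchdown."""
    return -stiffness if height < 0.0 else 0.0


def smooth_contact_force(height, stiffness=1000.0, sharpness=50.0):
    """Softplus-smoothed force (illustrative assumption, not the paper's model).
    Approaches the hard force as `sharpness` grows."""
    x = -sharpness * height
    # Numerically stable softplus: log(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return stiffness * softplus / sharpness


def smooth_contact_grad(height, stiffness=1000.0, sharpness=50.0):
    """d(force)/d(height) = -stiffness * sigmoid(-sharpness * height).
    Nonzero slightly above the ground, so gradients carry information
    before contact is made."""
    x = -sharpness * height
    # Numerically stable sigmoid.
    if x >= 0.0:
        sigmoid = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        sigmoid = e / (1.0 + e)
    return -stiffness * sigmoid


if __name__ == "__main__":
    # Compare both models just above, at, and below the ground plane.
    for h in (0.05, 0.01, 0.0, -0.01, -0.05):
        print(
            f"h={h:+.2f}  "
            f"hard: f={hard_contact_force(h):8.3f} df/dh={hard_contact_grad(h):9.3f}  "
            f"smooth: f={smooth_contact_force(h):8.3f} df/dh={smooth_contact_grad(h):9.3f}"
        )
```

In this toy setting the hard model returns a zero derivative whenever the foot is above the ground, while the smoothed model returns a small negative derivative there. A first-order method such as SHAC depends on derivatives of this kind being informative, which is the general motivation for a smooth contact model. The trade-off in this sketch is that a small `sharpness` produces force at a distance, which is physically less accurate.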
The study demonstrates the successful transfer of quadrupedal locomotion policies, learned solely within a differentiable simulator, to a real robot. The analytically smooth contact model proved crucial for generating effective and transferable locomotion gaits. In addition, the SHAC algorithm showed better sample efficiency than traditional reinforcement learning methods such as PPO.
This research establishes that training complex robotic skills like quadrupedal locomotion entirely within differentiable simulators is feasible and can produce policies directly applicable to real-world robots. The development of the analytically smooth contact model is a key enabler for this achievement, paving the way for more efficient and realistic robot learning in simulation.
This work significantly contributes to the field of robotics by demonstrating the potential of differentiable simulators for achieving sim-to-real transfer in challenging locomotion tasks. It highlights the importance of accurate and differentiable contact models for bridging the gap between simulation and reality.
While this study provides a proof-of-concept, the authors acknowledge limitations, including the need for further analysis and optimization of the learning setup. Future research will focus on enhancing the robustness of the learned policies by incorporating rough terrain and exploring the integration of more complex actuator models within the differentiable simulation framework.
Key Insights Distilled From
by Joshua Bagaj... at arxiv.org 11-05-2024
https://arxiv.org/pdf/2411.02189.pdf