This work explores the application of Large Language Models (LLMs) as autonomous agents for spacecraft control, focusing on the Kerbal Space Program Differential Games (KSPDG) challenge. The authors developed an LLM-based agent that ranked 2nd in the KSPDG challenge, showcasing the effectiveness of LLMs in solving control problems in the space domain.
The key highlights and insights are:
Limitations of traditional Reinforcement Learning (RL) methods in space applications: RL algorithms typically require large numbers of simulation runs and a well-defined reward function, both of which are hard to obtain in the space domain, where high-fidelity simulators are scarce and suitable reward functions are difficult to specify.
Prompt engineering and observation augmentation: The authors employed prompt engineering techniques to optimize the performance of the LLM, including providing concise explanations of the Kerbal Space Program (KSP) in the system prompt and giving periodic observations in the user prompt. They also augmented the observation space by providing additional calculated observations, such as relative position, distance to the evader, and direction of the evader, to improve the LLM's reasoning capabilities.
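The augmented observations described above can be sketched as a small helper that derives relative quantities from the raw pursuer and evader positions. The function name and return fields below are illustrative assumptions, not the authors' actual code:

```python
import math

def augment_observation(pursuer_pos, evader_pos):
    """Derive extra observations for the LLM prompt (hypothetical helper).

    Returns the relative position, the Euclidean distance to the evader,
    and a unit direction vector pointing from pursuer toward evader.
    """
    rel = [e - p for p, e in zip(pursuer_pos, evader_pos)]
    dist = math.sqrt(sum(c * c for c in rel))
    direction = [c / dist for c in rel] if dist > 0 else [0.0, 0.0, 0.0]
    return {"relative_position": rel, "distance": dist, "direction": direction}

obs = augment_observation([0.0, 0.0, 0.0], [3.0, 4.0, 0.0])
```

Spelling out these derived quantities in the user prompt spares the model from performing vector arithmetic itself, which is a common failure mode for LLM reasoning.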
Few-shot prompting: The authors used few-shot prompting to mitigate the LLM's tendency to omit the required function call in its first response, a failure that cascades into subsequent responses. By manually writing the first response and appending it to the conversation history, they improved the LLM's reasoning and performance in later responses.
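This bootstrapping step can be sketched as seeding the message history with a hand-written assistant turn that demonstrates a correct function call before the first live observation arrives. The message schema follows the OpenAI chat format; the prompt strings and the `apply_throttle` function name are illustrative assumptions:

```python
# Seed the conversation with a manually authored assistant turn so the
# model sees one correct function-call example before responding live.
def build_messages(system_prompt, example_obs, handwritten_call, live_obs):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": example_obs},
        {"role": "assistant", "content": None,
         "function_call": handwritten_call},   # hand-written example turn
        {"role": "user", "content": live_obs},  # first real observation
    ]

msgs = build_messages(
    "You control the pursuer spacecraft in Kerbal Space Program.",
    "relative position: [10, 0, 0] m, distance: 10 m",
    {"name": "apply_throttle", "arguments": '{"forward": 1.0}'},
    "relative position: [9, 0, 0] m, distance: 9 m",
)
```

Because the fabricated first turn already exhibits the expected function-call format, later responses are far more likely to follow it.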
Fine-tuning: The authors fine-tuned the LLM using human gameplay data, which significantly reduced the response latency and eliminated the failure rate observed in the baseline LLM. The fine-tuning process involved adjusting hyperparameters, incorporating a system prompt, and adding more training data.
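Turning human gameplay into fine-tuning data can be sketched as converting each logged observation/action pair into a chat-format training example that carries the system prompt. The field names and log layout here are assumptions for illustration, not the authors' exact schema:

```python
import json

def gameplay_to_examples(system_prompt, log):
    """Convert logged (observation, action) steps into chat-format
    fine-tuning examples, one example per step (illustrative sketch)."""
    examples = []
    for step in log:
        examples.append({
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": step["observation"]},
                {"role": "assistant", "content": step["action"]},
            ]
        })
    return examples

log = [{"observation": "distance: 120 m", "action": "throttle forward 1.0"}]
jsonl = "\n".join(json.dumps(e)
                  for e in gameplay_to_examples("KSP pursuer agent.", log))
```

Serializing one example per line (JSONL) matches the common format accepted by hosted fine-tuning APIs, and including the system prompt in every example keeps training and inference inputs consistent.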
The authors conclude that the integration of LLMs into critical space missions poses significant challenges, and rigorous testing procedures are crucial to ensure the reliability and safety of LLM-based systems. They propose several directions for future work, including investigating the performance of various LLMs, exploring the use of Large Multimodal Models (LMMs), and studying the scalability of fine-tuning LLMs with larger and more diverse datasets.
Key insights from the source content by Victor Rodri... at arxiv.org, 04-02-2024: https://arxiv.org/pdf/2404.00413.pdf