The paper introduces a method for safe reinforcement learning (RL) that leverages pre-trained language models (LMs) to handle free-form natural language constraints. Previous safe RL methods with natural language constraints typically relied on recurrent neural networks, which struggle to handle diverse forms of human language input. Furthermore, these methods often required a ground-truth cost function, necessitating domain expertise to convert language constraints into a well-defined cost function.
To address these issues, the proposed method uses a decoder LM (GPT) to condense the semantic meaning of the natural language constraints and an encoder LM (BERT) to encode the constraints and text-based observations into embeddings. The cosine similarity between the constraint and observation embeddings is then used to predict whether the constraint has been violated, without requiring any ground-truth cost information.
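The violation-prediction idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `embed` function below is a deterministic bag-of-words stand-in for the pre-trained BERT encoder, and the function names and threshold value are illustrative assumptions.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Placeholder encoder: sums a fixed random vector per token.
    In the paper this role is played by a pre-trained BERT encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        # Seed a generator per token so each word maps to a fixed vector.
        rng = np.random.default_rng(abs(hash(token)) % (2**32))
        vec += rng.standard_normal(dim)
    return vec

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def predict_cost(constraint: str, observation: str,
                 threshold: float = 0.5) -> int:
    """Binary cost signal: 1 if the text-based observation appears to
    violate the constraint, judged by embedding similarity alone
    (no ground-truth cost function needed)."""
    sim = cosine_similarity(embed(constraint), embed(observation))
    return int(sim > threshold)
```

The key design point this sketch mirrors is that the cost is derived purely from the similarity of constraint and observation embeddings, so no hand-engineered cost function is required; the threshold here is a hypothetical hyperparameter.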
The method is evaluated on two environments: Hazard-World-Grid, a grid-world navigation task, and SafetyGoal, a robot control task. Experiments show that the proposed method can achieve strong performance in terms of both reward and cost, while adhering to the given free-form natural language constraints. Extensive ablation studies demonstrate the efficacy of the decoder LM and encoder LM in the cost prediction module.
Key insights extracted from the paper by Xingzhou Lou... on arxiv.org, 04-22-2024.
https://arxiv.org/pdf/2401.07553.pdf