Core Concepts
LLM-Personalize, a framework that personalizes large language model (LLM) planners for household robotics tasks by combining imitation learning and iterative self-training, achieves significant improvements in alignment with user preferences compared to existing LLM-based planners.
Abstract
The paper introduces LLM-Personalize, a framework for personalizing large language model (LLM) planners for household robotics tasks. The key components of the framework are:
Context Generator: Maintains and updates an internal representation of the household state, including rooms, receptacles, and objects, based on the robot's local observations. This information is provided as a prompt to the LLM planner.
LLM Planner: An LLM-based module that iteratively generates high-level plans, each a sequence of actions (e.g., go to object, pick up object, place object on receptacle), so that it can handle partial observability.
Optimization Pipeline: Personalizes the LLM planner to user preferences through two components, imitation learning and iterative self-training:
Imitation Learning: Bootstraps the LLM planner to effectively interpret complex input contexts, produce executable plans, and perform initial alignment with example user preferences.
Iterative Self-Training: Allows the LLM planner to further explore and refine its planning strategies based on user preferences collected through interactions.
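The interplay between the Context Generator and the LLM Planner described above can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the `HouseholdContext` class, the prompt layout, the `run_episode` loop, and `stub_planner`, which stands in for the real LLM call and simply returns canned action strings. The sketch only shows the general idea of accumulating local observations into a state, rendering that state as a prompt, and replanning after each observation.

```python
from dataclasses import dataclass, field


@dataclass
class HouseholdContext:
    """Hypothetical stand-in for the Context Generator's internal state."""

    # room name -> receptacle name -> object names last seen there
    rooms: dict[str, dict[str, list[str]]] = field(default_factory=dict)

    def update(self, room: str, observation: dict[str, list[str]]) -> None:
        """Merge a local observation (receptacle -> objects) into the state."""
        receptacles = self.rooms.setdefault(room, {})
        for receptacle, objects in observation.items():
            receptacles[receptacle] = list(objects)

    def to_prompt(self) -> str:
        """Render the known state as plain text (assumed prompt layout)."""
        lines = []
        for room, receptacles in self.rooms.items():
            for receptacle, objects in receptacles.items():
                contents = ", ".join(objects) if objects else "empty"
                lines.append(f"{room} / {receptacle}: {contents}")
        return "\n".join(lines)


def stub_planner(prompt: str) -> list[str]:
    """Placeholder for the LLM call: returns a canned high-level plan."""
    if "mug" in prompt:
        return ["go to mug", "pick up mug", "place mug on shelf"]
    # Nothing to rearrange yet, so keep exploring (assumed behaviour).
    return ["go to living room"]


def run_episode(context, observations, planner):
    """Replan after every new local observation (partial observability)."""
    plans = []
    for room, observation in observations:
        context.update(room, observation)
        plans.append(planner(context.to_prompt()))
    return plans


if __name__ == "__main__":
    episode = [
        ("kitchen", {"shelf": []}),
        ("living room", {"sofa": ["mug"]}),
    ]
    for plan in run_episode(HouseholdContext(), episode, stub_planner):
        print(plan)
```

In this sketch the planner only sees what the robot has observed so far, which is why the first plan is an exploration step and the second one moves the object. A real system would replace `stub_planner` with a call to the personalized LLM.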
The authors evaluate LLM-Personalize on Housekeep, a challenging benchmark that simulates real-world household rearrangement tasks in a 3D environment. LLM-Personalize achieves a success rate more than 30% higher than that of existing LLM-based planners, which the authors attribute to improved alignment with human preferences.
The authors also conduct ablation studies to analyze the improvements in plan executability, exploration vs. exploitation behavior, and cross-domain generalization of LLM-Personalize.
Stats
"LLM-Personalize achieves more than a 30 percent increase in success rate over existing LLM planners, showcasing significantly improved alignment with human preferences."
Quotes
"Central to our approach is the optimization pipeline, which combines imitation learning and iterative self-training to personalize the LLM planner."
"We show in our experiments that LLM-Personalize outperforms state-of-the-art baseline LLM-based planners with more than a 30 percent increase in success rate, as a result of improved understanding and alignment with human preferences."