Key idea
Jointly fine-tuning a high-level planner with a low-level language model, using a novel soft-selection method for action embeddings, improves language modeling performance, most notably as measured by perplexity.
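The summary does not spell out the soft-selection mechanism, but a common way to make a discrete planner choice differentiable for joint fine-tuning is to mix the action embeddings with the planner's softmax probabilities instead of taking a hard argmax. The sketch below illustrates that idea only; the class name `SoftActionSelector`, its parameters, and the way the mixed embedding would be fed to the language model are assumptions, not the authors' exact method.

```python
# A minimal sketch of soft selection over action embeddings, assuming the
# planner emits logits over a discrete action vocabulary. All names here
# (SoftActionSelector, num_actions, embed_dim) are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftActionSelector(nn.Module):
    """Mixes action embeddings with planner probabilities rather than a hard
    argmax, keeping the planner-to-LM path differentiable end to end."""

    def __init__(self, num_actions: int, embed_dim: int):
        super().__init__()
        self.action_embeddings = nn.Embedding(num_actions, embed_dim)

    def forward(self, planner_logits: torch.Tensor) -> torch.Tensor:
        # planner_logits: (batch, num_actions)
        probs = F.softmax(planner_logits, dim=-1)  # soft selection weights
        # Convex combination of all action embeddings -> (batch, embed_dim)
        return probs @ self.action_embeddings.weight
```

Under this reading, the resulting soft embedding would be injected into the language model's input sequence (e.g., prepended or added to token embeddings), so the language-modeling loss can backpropagate into the planner during joint fine-tuning.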
Statistics
Our best setting improved perplexity by 0.3 for GPT-2 and 0.08 for OLMo over the baseline.
When planner-predicted actions are used during training, the perplexity improvement is around 5%.