Distilling Self-Improvement Ability into Smaller Language Models through Interactive Demonstrations
TRIPOST, a training algorithm that endows smaller language models with the ability to self-improve by learning from interactive demonstrations with expert language models.