Key Concepts
Sequence-based pre-training methods can enhance procedural understanding in natural language processing by leveraging the order of steps as a supervision signal.
Abstract
The paper proposes several novel 'order-as-supervision' pre-training methods to improve procedural text understanding, which is challenging because the attributes of entities change as a procedure unfolds. The methods are Permutation Classification, Embedding Regression, and Skip-Clip.
The key highlights are:
- Permutation Classification treats the order of steps as a multi-class classification problem, predicting the index of the applied permutation (see the first sketch after this list).
- Embedding Regression converts the permutation into an embedding vector and performs regression on that embedding, which is equivalent to optimizing ranking metrics (second sketch below).
- Skip-Clip learns representations by ranking target steps according to their proximity to a given context (third sketch below).
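A minimal sketch of the Permutation Classification idea, assuming PyTorch, a fixed number of steps K, and a stand-in step encoder (the paper would use a pre-trained language model; the dimensions and model names here are illustrative):

```python
import itertools
import random
import torch
import torch.nn as nn

# Map each permutation of K steps to a class index, then train a classifier
# to recover which permutation was applied to the shuffled steps.
K = 4  # number of steps per procedure (assumption)
PERMS = list(itertools.permutations(range(K)))   # K! possible orders
PERM_TO_IDX = {p: i for i, p in enumerate(PERMS)}

class PermutationClassifier(nn.Module):
    def __init__(self, step_dim=64, num_classes=len(PERMS)):
        super().__init__()
        # stand-in for a real step encoder head (e.g., on top of a pre-trained LM)
        self.head = nn.Linear(K * step_dim, num_classes)

    def forward(self, shuffled_step_embeddings):  # (batch, K, step_dim)
        flat = shuffled_step_embeddings.flatten(1)
        return self.head(flat)                    # logits over K! permutation classes

model = PermutationClassifier()
loss_fn = nn.CrossEntropyLoss()

# toy batch: random "step embeddings" shuffled by one random permutation
steps = torch.randn(8, K, 64)
perm = random.choice(PERMS)
shuffled = steps[:, list(perm), :]
target = torch.full((8,), PERM_TO_IDX[perm], dtype=torch.long)

loss = loss_fn(model(shuffled), target)
loss.backward()
```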
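A hedged sketch of Embedding Regression: here the target permutation is encoded as a dense vector of normalized rank positions, and per-step scores are regressed onto it so that sorting the scores recovers the order. The specific rank-based embedding, MSE loss, and linear scorer are assumptions for illustration; the paper's exact permutation embedding may differ.

```python
import torch
import torch.nn as nn

K = 4

def permutation_embedding(perm):
    # rank position of each step under the permutation, scaled to [0, 1]
    ranks = torch.empty(K)
    for position, step_index in enumerate(perm):
        ranks[step_index] = position
    return ranks / (K - 1)

scorer = nn.Linear(64, 1)                    # stand-in for a step encoder + scoring head
steps = torch.randn(8, K, 64)                # toy "step embeddings"
perm = (2, 0, 3, 1)
target = permutation_embedding(perm).expand(8, K)

pred = scorer(steps).squeeze(-1)             # one score per step
loss = nn.functional.mse_loss(pred, target)  # sorting pred recovers the step order
loss.backward()
```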
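A hedged sketch of a Skip-Clip-style objective: candidate target steps are scored against a context encoding, and steps nearer to the context are encouraged to outrank steps further away. The dot-product similarity and margin ranking loss are assumptions about the exact form; the encoders are again stand-ins.

```python
import torch
import torch.nn as nn

dim = 64
context = torch.randn(8, dim)                  # toy context encodings
targets = torch.randn(8, 5, dim)               # 5 candidate steps; index 0 is closest to the context

# similarity of each candidate step to its context
scores = torch.einsum("bd,bkd->bk", context, targets)
margin_loss = nn.MarginRankingLoss(margin=0.1)

loss = 0.0
ones = torch.ones(8)
for near, far in zip(range(4), range(1, 5)):   # each step should outrank the next-farther step
    loss = loss + margin_loss(scores[:, near], scores[:, far], ones)
loss.backward()
```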
The proposed methods are evaluated on two downstream Entity Tracking datasets - NPN-Cooking in the recipe domain and ProPara in the open domain. The results show that the order-based pre-training methods outperform baselines and state-of-the-art language models, with improvements of 1.6% and 7-9% across different metrics.
The paper also analyzes the combination of different pre-training strategies, finding that using a single strategy performs better than sequential combinations, as the strategies use different supervision cues.
Statistics
The dataset used for pre-training contains over 2.5 million recipes collected from various sources on the internet.
Quotes
"Our work is one of the first to introduce and compare several novel 'order-as-supervision' pre-training methods such as Permutation Classification, Skip-Clip, and Embedding Regression to enhance procedural understanding."
"Our proposed methods address the non-trivial Entity Tracking Task that requires prediction of entity states across procedure steps, which requires understanding the order of steps."