Structured Packing Improves Long Context Utilization in Large Language Model Training
Structuring training data by collating mutually relevant documents into a single training context is an effective strategy for improving long-context utilization in large language models.
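To make the idea of collating mutually relevant documents concrete, here is a minimal sketch of one way such packing could be done. It is not the method described by the source. The relevance score (Jaccard overlap of word sets), the whitespace tokenization, the greedy selection order, and the names `jaccard`, `pack_related`, and `max_tokens` are all assumptions introduced for illustration. A real pipeline would more plausibly use the model's tokenizer and a proper retrieval method.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-set overlap in [0, 1]; a stand-in relevance score (assumption)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pack_related(docs: List[str], max_tokens: int) -> List[List[str]]:
    """Greedily group documents so each training context holds related text.

    Token counts are whitespace-split word counts (an assumption).
    A document longer than max_tokens still forms its own context;
    truncation is left to a later step.
    """
    token_sets = [set(d.lower().split()) for d in docs]
    lengths = [len(d.split()) for d in docs]
    unused = list(range(len(docs)))
    contexts: List[List[str]] = []

    while unused:
        # Start a new context from the first document not yet placed.
        seed = unused.pop(0)
        group = [seed]
        vocab = set(token_sets[seed])
        budget = max_tokens - lengths[seed]

        while True:
            # Only consider documents that still fit in the remaining budget.
            fitting = [i for i in unused if lengths[i] <= budget]
            if not fitting:
                break
            best = max(fitting, key=lambda i: jaccard(vocab, token_sets[i]))
            if jaccard(vocab, token_sets[best]) == 0.0:
                break  # nothing related remains, so close this context
            unused.remove(best)
            group.append(best)
            vocab |= token_sets[best]
            budget -= lengths[best]

        contexts.append([docs[i] for i in group])

    return contexts


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "stock prices fell sharply today",
        "a cat chased the mouse",
        "bond and stock prices rose",
    ]
    for context in pack_related(docs, max_tokens=12):
        print(context)
```

In this toy run, the two cat sentences end up in one context and the two market sentences in another, whereas packing in input order would have mixed the topics. The greedy loop is quadratic in the number of documents, so it only suits small illustrations.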