Core Concepts
Large language models struggle to generalize to unseen combinations of known primitives in semantic parsing tasks, despite their success in other NLP domains. Recent research investigates the underlying causes of this failure and develops methods to strengthen compositional generalization.
Abstract
This survey examines the challenges that large language models (LLMs) face in achieving compositional generalization on semantic parsing tasks. It first defines compositional generalization and describes how it is evaluated on benchmark datasets such as SCAN, COGS, and CFQ.
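To make the evaluation setup concrete, the toy sketch below mimics a SCAN-style compositional split: a primitive ("jump") appears in training only in isolation, while the test set combines it with a modifier seen elsewhere, and models are scored by exact sequence match. The data and the lookup baseline are illustrative assumptions, not the actual benchmark contents.

```python
# Toy SCAN-like compositional split (illustrative data, not the real benchmark).
TRAIN = [
    ("walk", "WALK"),
    ("walk twice", "WALK WALK"),
    ("run", "RUN"),
    ("run twice", "RUN RUN"),
    ("jump", "JUMP"),            # "jump" is seen, but never with "twice"
]

TEST = [
    ("jump twice", "JUMP JUMP"), # unseen combination of known primitives
]

def exact_match_accuracy(predict, examples):
    """Exact sequence match, the standard metric on SCAN-like benchmarks."""
    correct = sum(predict(cmd) == target for cmd, target in examples)
    return correct / len(examples)

# `predict` would normally be a trained parser's decode function; a pure
# memorization baseline already shows why the split is hard.
memorized = dict(TRAIN)
baseline = lambda cmd: memorized.get(cmd, "")

print(exact_match_accuracy(baseline, TRAIN))  # 1.0
print(exact_match_accuracy(baseline, TEST))   # 0.0
```

The point of such splits is that test accuracy measures recombination of known pieces, not memorization of seen input-output pairs.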
The key factors hindering LLM-based semantic parsers from generalizing compositionally are then discussed. These include the inherent inability of vanilla seq2seq models to generalize to unseen structures, the autoregressive decoding step acting as a bottleneck, and the distributional mismatch between the pretraining corpus and the symbolic outputs required for semantic parsing.
The survey then reviews methods proposed to improve compositional generalization, grouped into data-augmentation approaches and model-based techniques. Data-augmentation methods expose the model to additional training examples that recombine known primitives in novel ways. Model-based approaches incorporate inductive biases through neuro-symbolic architectures or design prompting strategies that decompose the parsing task into simpler sub-steps, as sketched below.
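The sketch below illustrates the decomposition-style prompting idea in a minimal form, loosely in the spirit of least-to-most prompting. It is an assumption-laden illustration rather than a method from the survey: the `llm` callable stands in for any text-completion API, and the prompt wording and three-step structure are hypothetical.

```python
from typing import Callable

def parse_by_decomposition(utterance: str, llm: Callable[[str], str]) -> str:
    """Decompose-then-parse prompting sketch: split the request into simpler
    sub-requests, parse each one, then compose the partial parses."""
    # Step 1: ask the model to break the utterance into simpler sub-requests.
    decomposition = llm(
        "Decompose the request into a numbered list of simpler sub-requests.\n"
        f"Request: {utterance}\nSub-requests:"
    )
    # Step 2: translate each sub-request, carrying earlier parses as context
    # so later steps can reuse already-solved pieces.
    context = ""
    for sub in decomposition.splitlines():
        sub = sub.strip()
        if not sub:
            continue
        context += llm(
            f"{context}\nTranslate into a logical form: {sub}\nLogical form:"
        ) + "\n"
    # Step 3: compose the partial logical forms into one program.
    return llm(
        f"{context}\nCombine the logical forms above into a single program "
        f"answering: {utterance}\nProgram:"
    )
```

The design intent is that each sub-step stays within patterns the model has seen, so the unseen compositional structure is handled by explicit composition of partial parses rather than by a single monolithic decoding pass.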
Finally, the survey discusses emerging research trends in this area, highlighting the orthogonal nature of the proposed solutions and the potential for a more foundational rethinking of the seq2seq paradigm for semantic parsing.