The core idea of this work is to leverage the pre-existing coding abilities of large language models (LLMs) to improve in-context learning (ICL) for semantic parsing. The authors make two key changes:
Using general-purpose programming languages (PLs) like Python instead of domain-specific languages (DSLs) as the output representation. This lets LLMs draw on their existing knowledge of coding practices and standard operations, rather than having to learn a new DSL's syntax and semantics from just a few demonstrations (a contrast is sketched after these two points).
Augmenting the ICL prompt with a structured Domain Description (DD) that outlines the available classes, methods, and types in the target domain. This gives the LLM crucial information about the functionality and intended usage of the output program, which is especially important when only a few demonstrations are available (an example DD is sketched below).
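To make the first change concrete, here is a hypothetical GeoQuery-flavored contrast between a DSL target and an equivalent Python target. The identifiers (state, population, answer) and the toy knowledge base are illustrative assumptions, not the paper's actual schema or data:

```python
# Hypothetical contrast: the same utterance as a DSL string vs. an
# equivalent Python program. All names here are illustrative stubs.

GEO_POPULATION = {"texas": 29_000_000}  # toy knowledge base, not real data

def state(name: str) -> str:
    """Resolve a state entity by name (illustrative stub)."""
    return name.lower()

def population(entity: str) -> int:
    """Return the population of a resolved entity (illustrative stub)."""
    return GEO_POPULATION[entity]

def answer(value):
    """Mark a value as the final answer (illustrative stub)."""
    return value

# Utterance: "What is the population of Texas?"
#
# DSL (FunQL-style) target the model must learn from a handful of demos:
#     answer(population(stateid('texas')))
#
# Python target, using call syntax the model already knows:
result = answer(population(state("Texas")))
print(result)  # -> 29000000
```

The point is not the stubs themselves but the surface form: the Python version consists of ordinary function calls whose conventions the LLM has seen countless times during pretraining.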
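The second change can be sketched in the same spirit. Below is a minimal, assumed example of what a DD for a calendar domain might look like, written as ordinary Python signatures and docstrings that are prepended to the prompt; the names (Person, Event, find_person, create_event) are invented for illustration and are not SMCalFlow's actual API:

```python
# Hypothetical Domain Description (DD) for a calendar domain: type and
# method signatures with short docstrings, shown to the LLM in the prompt.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Person:
    name: str

@dataclass
class Event:
    subject: str
    start: datetime
    attendees: list[Person]

def find_person(name: str) -> Person:
    """Look up a person by name (illustrative stub)."""
    return Person(name)

def create_event(subject: str, start: datetime,
                 attendees: list[Person]) -> Event:
    """Create a calendar event with the given subject, start time, and
    attendees (illustrative stub)."""
    return Event(subject, start, attendees)
```

Even without a demonstration that happens to use create_event, a DD like this tells the model which calls exist, what they take, and what they return.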
The authors evaluate their approach on three semantic parsing datasets (GeoQuery, SMCalFlow, and Overnight) using both ChatGPT and the open-source StarCoder model. They find that prompting the models with Python programs and a DD consistently outperforms prompting with the original DSLs, often by a large margin, and this holds even when the DSL prompts are also augmented with a DD.
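As a rough illustration of this setup (a sketch, not the authors' actual code), a Python+DD prompt might be assembled by concatenating the DD, a few utterance/program demonstrations, and the test utterance; build_prompt and the tomorrow_at helper mentioned in the demonstration are hypothetical:

```python
# Minimal sketch of Python+DD prompt assembly; all names and strings here
# are assumed for illustration, not taken from the paper's code.

DOMAIN_DESCRIPTION = '''\
def find_person(name: str) -> Person: ...
def create_event(subject: str, start: datetime,
                 attendees: list[Person]) -> Event: ...
'''

DEMONSTRATIONS = [
    ("Set up a meeting with Alice tomorrow at 9",
     'create_event("meeting", tomorrow_at(9), [find_person("Alice")])'),
]

def build_prompt(dd: str, demos: list[tuple[str, str]], query: str) -> str:
    """Concatenate the DD, a few demonstrations, and the test utterance."""
    parts = ["# Available classes and methods:\n" + dd]
    for utterance, program in demos:
        parts.append(f"# Utterance: {utterance}\n{program}")
    parts.append(f"# Utterance: {query}\n")  # the LLM completes the program
    return "\n\n".join(parts)

print(build_prompt(DOMAIN_DESCRIPTION, DEMONSTRATIONS,
                   "Schedule lunch with Bob on Friday at noon"))
```

Under this framing, the DSL baselines differ only in the target representation (and optionally the DD), keeping the rest of the prompt structure fixed.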
Notably, the Python+DD approach dramatically improves compositional generalization, nearly closing the performance gap between i.i.d. and compositional test splits. Further analysis shows that a PL's prevalence in pretraining corpora is not the sole factor determining performance: even a comparatively rare PL like Scala can outperform a more common one like Python, as long as its syntax and structure resemble those of general-purpose code.
Overall, the findings suggest that when using LLMs for semantic parsing, it is better to either prompt them with PLs or design DSLs that closely resemble PLs, while also providing a detailed DD. This offers an improved methodology for building semantic parsing applications in the current era of in-context learning with LLMs.
Source: Ben Bogin et al., https://arxiv.org/pdf/2311.09519.pdf (arxiv.org, accessed 03-29-2024)