Core Concepts
Leveraging language model-based conversation summarization to enable effective and efficient retrieval of similar dialogues for few-shot dialogue state tracking.
Abstract
The paper proposes a novel approach for conversation retrieval in the context of few-shot dialogue state tracking (DST) using large language models (LLMs).
Key highlights:
- Previous works use raw dialogue context as search keys and queries, and fine-tune a retriever with annotated dialogues. This approach is less suited for scaling to new domains or languages where fine-tuning data is unavailable.
- To address this, the authors handle conversation retrieval based on text summaries of the conversations, generated by an LLM-based conversation summarizer. This enables effective maximum inner product search.
- To avoid the extra inference cost of LLM-based summarization, the authors further distill a lightweight conversation encoder (CONVERSE) that produces query embeddings without decoding summaries.
- Experiments on MultiWOZ datasets with GPT-Neo-2.7B and LLaMA-7B/30B show that the proposed retrieval approach significantly outperforms relevant baselines in few-shot DST settings.
- The distilled CONVERSE model not only improves efficiency, but also achieves better end-to-end performance compared to using explicit query generation.
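The retrieval pipeline described above can be sketched as follows. This is a toy illustration, not the paper's implementation: the encoder here is a trivial bag-of-words stand-in for a real summary encoder (or the distilled CONVERSE model), and the summaries are invented; only the maximum-inner-product-search step is the point.

```python
# Sketch: retrieve similar conversations by maximum inner product
# search (MIPS) over summary embeddings. The encoder is a toy
# bag-of-words stand-in; a real system would use a learned encoder.
import numpy as np

def encode(texts):
    """L2-normalized bag-of-words vectors over a shared vocabulary.
    Placeholder for a real summary/conversation encoder."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    mat = np.zeros((len(texts), len(vocab)))
    for row, t in enumerate(texts):
        for tok in t.lower().split():
            mat[row, index[tok]] += 1.0
    return mat / np.linalg.norm(mat, axis=1, keepdims=True)

def mips_retrieve(query_summary, key_summaries, k=1):
    """Return indices of the k stored summaries with the largest
    inner product against the query summary's embedding."""
    embs = encode([query_summary] + key_summaries)
    scores = embs[1:] @ embs[0]          # inner product with the query
    return np.argsort(-scores)[:k].tolist()

# Invented example summaries acting as search keys.
keys = [
    "The user wants to book a taxi to be picked up at a specific location.",
    "There are 9 Indian restaurants in the center.",
    "The user asks for a train from Cambridge to London on Friday.",
]
print(mips_retrieve("The user needs a taxi pickup at the hotel.", keys, k=1))  # -> [0]
```

At test time, the retrieved conversations (and their annotations) would be placed into the LLM prompt as in-context examples for state tracking.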
Examples
There are 9 Indian restaurants in the center.
The user wants to book a taxi to be picked up at a specific location and dropped off at another.
Quotes
"Few-shot dialogue state tracking (DST) with Large Language Models (LLM) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning."
"To address this problem, we handle the task of conversation retrieval based on text summaries of the conversations. A LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search."
"To avoid the extra inference cost brought by LLM-based conversation summarization, we further distill a light-weight conversation encoder which produces query embeddings without decoding summaries for test conversations."
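The distillation objective in the last quote can be illustrated with a minimal sketch: train a lightweight student encoder so that its embedding of the raw conversation matches the teacher embedding of the LLM-generated summary, so no summary needs to be decoded at test time. All tensors and the linear student below are illustrative stand-ins, not the paper's architecture or loss.

```python
# Sketch: embedding distillation. A linear "student" maps raw-dialogue
# features toward a fixed teacher embedding of the summary; one gradient
# step on the squared error should reduce the loss. Toy data throughout.
import numpy as np

rng = np.random.default_rng(0)

teacher_embedding = rng.normal(size=8)            # e(LLM summary), fixed target
student_weights = rng.normal(size=(8, 8)) * 0.1   # toy student encoder
conversation_features = rng.normal(size=8)        # raw-dialogue features

def student(w, x):
    return w @ x

def mse(a, b):
    return float(np.mean((a - b) ** 2))

pred = student(student_weights, conversation_features)
before = mse(pred, teacher_embedding)

# Gradient of mean((Wx - t)^2) w.r.t. W, then one SGD step.
grad = 2.0 * np.outer(pred - teacher_embedding, conversation_features) / len(pred)
student_weights -= 0.05 * grad

after = mse(student(student_weights, conversation_features), teacher_embedding)
print(before > after)  # the distillation loss decreases
```

Once trained, the student produces query embeddings directly from the test conversation, avoiding the LLM summarization pass at inference.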