Key Concepts
Using large language models for automated relevance judgments in legal case retrieval.
Summary
The article discusses the challenges of collecting relevance judgments for legal case retrieval and proposes a novel workflow using large language models. The workflow breaks the annotation process down into stages that mimic how human annotators work. The method shows promising results in obtaining reliable relevance judgments and in augmenting legal case retrieval models.
- Introduction: Importance of legal case retrieval.
- Challenges: Inefficiency of keyword-based retrieval systems.
- Proposed Workflow: Few-shot workflow for relevance judgment.
- Data Annotation: Comparison of LLMs and human experts.
- Key Challenges: Expertise-intensive, lengthy-text, and nuance-sensitive annotation.
- Methodology: Preliminary factual analysis, adaptive demo-matching, fact extraction, few-shot annotation (see the sketch after this list).
- Application: Data augmentation for synthetic dataset creation.
- Experiment: Evaluation of annotations and data augmentation.
- Results: Reliability and validity of annotations, impact of different components on annotation quality.
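To make the staged methodology more concrete, the sketch below strings the four stages together around a generic text-completion callable. It is an illustrative sketch only: the function names, prompt wording, and the `complete` callable are hypothetical placeholders, and the word-overlap demonstration matcher is a stand-in for the adaptive demo-matching described in the article, not the authors' implementation.

```python
from typing import Callable, List, Tuple

# `complete` is a hypothetical text-completion callable (prompt -> model output);
# plug in whatever LLM client is available. Everything below is a rough sketch of
# the staged annotation workflow summarized above, not the article's exact prompts.

def extract_facts(complete: Callable[[str], str], case_text: str) -> str:
    """Fact extraction: condense a lengthy case document into its key facts."""
    prompt = (
        "Summarize the key legal facts of the following case in a few sentences:\n\n"
        f"{case_text}\n\nKey facts:"
    )
    return complete(prompt)

def match_demonstrations(query_facts: str,
                         demo_pool: List[Tuple[str, str, str]],
                         k: int = 2) -> List[Tuple[str, str, str]]:
    """Demo-matching: pick the demonstrations whose query facts overlap most
    with the current query (naive word-overlap stand-in for adaptive matching)."""
    query_terms = set(query_facts.lower().split())
    scored = sorted(
        demo_pool,
        key=lambda demo: len(query_terms & set(demo[0].lower().split())),
        reverse=True,
    )
    return scored[:k]

def judge_relevance(complete: Callable[[str], str],
                    query_case: str,
                    candidate_case: str,
                    demo_pool: List[Tuple[str, str, str]]) -> str:
    """Few-shot annotation: analyse facts, attach matched demonstrations,
    and ask the model for a relevance label."""
    query_facts = extract_facts(complete, query_case)        # preliminary factual analysis
    candidate_facts = extract_facts(complete, candidate_case)
    demos = match_demonstrations(query_facts, demo_pool)

    demo_text = "\n\n".join(
        f"Query facts: {q}\nCandidate facts: {c}\nRelevance: {label}"
        for q, c, label in demos
    )
    prompt = (
        "Judge the relevance of the candidate case to the query case "
        "(answer with one of: irrelevant, partially relevant, relevant).\n\n"
        f"{demo_text}\n\n"
        f"Query facts: {query_facts}\nCandidate facts: {candidate_facts}\nRelevance:"
    )
    return complete(prompt).strip()
```

A real pipeline would replace the word-overlap matcher with the article's adaptive demonstration selection and map the model's answer onto the dataset's relevance label scheme; the labelled pairs could then feed the data augmentation step listed above.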
Statistics
Accurately judging the relevance between two legal cases requires considerable effort to read the lengthy texts and a high level of domain expertise.
Large Language Models (LLMs) are designed to understand and generate human-like text with little to no fine-tuning required for specific tasks.
The LeCaRD dataset comprises more than 43,000 candidate cases and 107 query cases.
Quotes
"The proposed workflow breaks down the annotation process into a series of stages, imitating the process employed by human annotators."
"Empirical experiments demonstrate that our approach can achieve high consistency with expert annotations."