This paper introduces a novel method for automatically constructing large-scale, high-quality datasets of synthetic query-candidate pairs to improve legal case retrieval (LCR) systems, particularly in asymmetric retrieval scenarios where user queries are short.
The authors propose an approach to improve the understanding of case relevance in legal case retrieval that integrates lexical matching, semantic retrieval, and learning-to-rank techniques with heuristic pre-processing and post-processing methods.
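
As a rough illustration of how lexical and semantic signals might be combined into a single ranking, the following minimal sketch fuses two toy scores with a fixed weight. It is not the authors' method: the function names (`lexical_score`, `bag_of_words_vector`, `rank`), the weight `alpha`, and the sample cases are all hypothetical. A bag-of-words cosine stands in for a dense semantic retriever, and the fixed linear weighting stands in for a trained learning-to-rank model.

```python
import math
from collections import Counter


def lexical_score(query, doc):
    """Fraction of query tokens that also occur in the document (toy lexical match)."""
    q_tokens = query.lower().split()
    d_tokens = set(doc.lower().split())
    if not q_tokens:
        return 0.0
    return sum(t in d_tokens for t in q_tokens) / len(q_tokens)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bag_of_words_vector(text, vocab):
    """Term-count vector over a shared vocabulary (assumed stand-in for an embedding)."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]


def rank(query, docs, alpha=0.5):
    """Return document indices ordered by a weighted sum of the two scores.

    alpha is a hypothetical hand-set weight; a learning-to-rank model would
    learn how to combine such features from labeled relevance data instead.
    """
    vocab = sorted({t for text in [query, *docs] for t in text.lower().split()})
    q_vec = bag_of_words_vector(query, vocab)
    scored = []
    for i, doc in enumerate(docs):
        lex = lexical_score(query, doc)
        sem = cosine(q_vec, bag_of_words_vector(doc, vocab))
        scored.append((alpha * lex + (1 - alpha) * sem, i))
    # Highest score first; ties broken by original position.
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical candidate cases and a short query.
cases = [
    "contract breach damages awarded",
    "criminal theft sentencing appeal",
    "breach of contract claim dismissed",
]
ranking = rank("contract breach", cases)
```

In a real pipeline the two scores would more plausibly come from a lexical ranker such as BM25 and a trained dense encoder, with the heuristic pre-processing and post-processing steps applied before and after this fusion stage.
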
This paper introduces DELTA, a discriminative model for legal case retrieval that focuses on key facts to improve relevance determination.