Evaluating Large Language Models for Legal Answer Validation in U.S. Civil Procedure
This paper presents two approaches to legal answer validation in U.S. civil procedure: fine-tuning pre-trained BERT-based models and few-shot prompting of GPT models. The authors find that models pre-trained on domain-specific legal text perform better, and that reformulating the task as a multiple-choice QA problem significantly improves the performance of the GPT models.
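The multiple-choice reformulation can be illustrated with a minimal sketch: instead of asking the model to judge one (question, answer) pair, all candidate answers are presented together and the model selects a letter. The function name and prompt wording below are illustrative assumptions, not the paper's exact prompt.

```python
def build_mcq_prompt(introduction: str, question: str, candidates: list[str]) -> str:
    """Format a legal answer-validation instance as a multiple-choice QA
    prompt (illustrative sketch; not the paper's exact template)."""
    letters = "ABCDEFGH"
    # List each candidate answer as a lettered option.
    options = "\n".join(
        f"{letters[i]}. {answer}" for i, answer in enumerate(candidates)
    )
    return (
        f"{introduction}\n\n"
        f"Question: {question}\n"
        f"{options}\n"
        "Answer with the letter of the correct option."
    )

prompt = build_mcq_prompt(
    "The following question concerns U.S. civil procedure.",
    "When may a defendant remove a case to federal court?",
    ["Within 30 days of service.", "At any time before trial."],
)
```

Framing validation this way lets a single model call compare candidates against each other, which is one plausible reason the reformulation helps few-shot GPT prompting.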