CPSDBench is a specialized evaluation benchmark tailored for the Chinese public security domain. It integrates datasets related to public security from real-world scenarios, assessing LLMs across text classification, information extraction, question answering, and text generation tasks. Innovative evaluation metrics are introduced to quantify LLM efficacy accurately. The study aims to enhance understanding of existing models' performance in addressing public security issues and guide future development of more accurate models.
Key insights drawn from
by Xin Tong, Bo ... at arxiv.org, 03-05-2024
https://arxiv.org/pdf/2402.07234.pdf