CPSDBench is a specialized evaluation benchmark tailored for the Chinese public security domain. It integrates public security datasets collected from real-world scenarios and assesses large language models (LLMs) on four task types: text classification, information extraction, question answering, and text generation. The authors also introduce evaluation metrics designed to quantify LLM performance on these tasks more precisely. The study aims to clarify how well existing models handle public security problems and to guide the development of more accurate models.
Key Insights Distilled From
by Xin Tong, Bo ... at arxiv.org 03-05-2024
https://arxiv.org/pdf/2402.07234.pdf