CPSDBench is designed to evaluate LLMs in text classification, information extraction, question answering, and text generation tasks related to public security. The study highlights the performance of different LLMs across these tasks and identifies challenges faced by models in handling sensitive data, output formatting errors, understanding instructions, and content generation accuracy.
The research emphasizes the importance of balancing model safety with usability, improving output format flexibility, enhancing comprehension abilities, and optimizing content generation accuracy for future advancements in LLM applications within the public security domain.
Ke Bahasa Lain
dari konten sumber
arxiv.org
Wawasan Utama Disaring Dari
by Xin Tong,Bo ... pada arxiv.org 03-05-2024
https://arxiv.org/pdf/2402.07234.pdfPertanyaan yang Lebih Dalam