CPSDBench is designed to evaluate LLMs on text classification, information extraction, question answering, and text generation tasks related to public security. The study compares the performance of different LLMs across these tasks and identifies the challenges the models face: handling sensitive data, producing correctly formatted output, understanding instructions, and generating accurate content.
The research emphasizes the importance of balancing model safety with usability, improving output format flexibility, enhancing comprehension abilities, and optimizing content generation accuracy for future advancements in LLM applications within the public security domain.
Source: by Xin Tong, Bo ... at arxiv.org, 03-05-2024
https://arxiv.org/pdf/2402.07234.pdf