CPSDBench is designed to evaluate LLMs on text classification, information extraction, question answering, and text generation tasks related to public security. The study compares the performance of different LLMs across these tasks and identifies the challenges models face in handling sensitive data, avoiding output formatting errors, understanding instructions, and generating accurate content.
The research emphasizes the importance of balancing model safety with usability, making output formats more flexible, strengthening comprehension, and improving the accuracy of generated content as directions for future LLM applications in the public security domain.
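To make the evaluation setup concrete, the sketch below shows how a benchmark harness might loop over the four task categories and score model outputs with task-specific metrics. This is only an illustrative outline under assumptions: the task names follow the paper, but the example items, the metric functions (exact_match, token_overlap), and the run_model callable are hypothetical and are not CPSDBench's actual data or API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Example:
    prompt: str
    reference: str

def exact_match(prediction: str, reference: str) -> float:
    # 1.0 if the model output matches the reference label exactly, else 0.0.
    return float(prediction.strip() == reference.strip())

def token_overlap(prediction: str, reference: str) -> float:
    # Crude recall-style overlap; real benchmarks typically use F1/ROUGE or judge models.
    pred, ref = set(prediction.split()), set(reference.split())
    return len(pred & ref) / len(ref) if ref else 0.0

# Hypothetical examples per task; CPSDBench's real datasets are not reproduced here.
TASKS: dict[str, tuple[list[Example], Callable[[str, str], float]]] = {
    "text_classification": ([Example("Classify the report: ...", "fraud")], exact_match),
    "information_extraction": ([Example("Extract the suspect's name: ...", "Zhang Wei")], token_overlap),
    "question_answering": ([Example("Q: ... A:", "...")], token_overlap),
    "text_generation": ([Example("Summarize the case file: ...", "...")], token_overlap),
}

def evaluate(run_model: Callable[[str], str]) -> dict:
    """Average each task's metric over its examples for one model."""
    scores = {}
    for task, (examples, metric) in TASKS.items():
        per_example = [metric(run_model(ex.prompt), ex.reference) for ex in examples]
        scores[task] = sum(per_example) / len(per_example)
    return scores

if __name__ == "__main__":
    # Stand-in model that always answers "fraud"; swap in a real LLM call.
    print(evaluate(lambda prompt: "fraud"))
```

In practice, each task category would use its own metric (e.g., accuracy for classification, F1 for extraction, ROUGE or human/judge scoring for generation), which is why the sketch pairs every task with its own scoring function.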
Key ideas extracted from the source content at arxiv.org, by Xin Tong, Bo ..., 03-05-2024: https://arxiv.org/pdf/2402.07234.pdf