Core Concepts
AraTrust introduces a comprehensive benchmark to evaluate trustworthiness in Arabic LLMs, highlighting the need for safer and more trustworthy AI systems.
Abstract:
Importance of understanding AI systems' capabilities and risks.
Lack of trustworthiness benchmarks for Arabic LLMs.
Introduction of the AraTrust benchmark with 522 questions.
Aim to create safer and more trustworthy LLMs for Arabic users.
Introduction:
Safety concerns in non-English language models.
Unique challenges in evaluating trustworthiness for Arabic language.
Previous studies on safety concerns with ChatGPT.
Related Work: Trustworthiness Benchmarks for LLMs
Overview of existing benchmarks such as SafetyBench, DecodingTrust, and Do-Not-Answer.
Need for culture-specific trustworthiness evaluation benchmarks.
AraTrust Benchmark Construction:
522 multiple-choice questions across 8 categories of trustworthiness.
Data sources include authentic human-generated questions and datasets like Arabic Hate Speech.
Experiments: Evaluation Setup
Evaluation of GPT-4, GPT-3.5 Turbo, and AceGPT models on the AraTrust benchmark.
Results:
Performance comparison across zero-shot, one-shot, few-shot settings.
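To make the zero/one/few-shot comparison concrete, the sketch below shows one common way a multiple-choice benchmark like AraTrust can be scored: build a prompt with k solved examples followed by the target question, then count how often the model's answer letter matches the gold label. This is a minimal illustration, not the paper's actual harness; `answer_fn` is a hypothetical stand-in for a real LLM API call, and the prompt template is an assumption.

```python
def format_item(question, choices):
    """Render one question with lettered answer options."""
    letters = "ABCD"
    lines = [f"Question: {question}"]
    for letter, choice in zip(letters, choices):
        lines.append(f"{letter}. {choice}")
    return "\n".join(lines) + "\n"

def build_prompt(question, choices, examples=()):
    """Compose a k-shot prompt: k solved examples, then the target question.

    Passing an empty `examples` tuple yields a zero-shot prompt; one or more
    (question, choices, answer) triples yield one-shot / few-shot prompts.
    """
    parts = []
    for ex_q, ex_choices, ex_ans in examples:
        parts.append(format_item(ex_q, ex_choices) + f"Answer: {ex_ans}\n")
    parts.append(format_item(question, choices) + "Answer:")
    return "\n".join(parts)

def accuracy(items, answer_fn, examples=()):
    """Fraction of items where the model's answer letter matches the gold label.

    `items` is a list of (question, choices, gold_letter) triples;
    `answer_fn` is a hypothetical callable wrapping a model query.
    """
    correct = 0
    for question, choices, gold in items:
        prompt = build_prompt(question, choices, examples)
        if answer_fn(prompt).strip().upper().startswith(gold):
            correct += 1
    return correct / len(items)
```

Running the same `items` through `accuracy` with different `examples` tuples gives the zero-shot, one-shot, and few-shot scores being compared.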
Discussion:
Open-source LLMs underperform closed-source models on the AraTrust benchmark.
Conclusion:
Introduction of AraTrust as the first Arabic trustworthiness benchmark for LLMs.
Stats
GPT-4 proved to be the most trustworthy model for Arabic.
Quotes
"Excellence in work is a significant goal among the objectives of professional ethics." - Model Response from Example (E)