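Core Concept
This paper presents the Automated Verification of Textual Claims (AVeriTeC) shared task, which evaluates how well automated systems can retrieve evidence and predict the veracity of real-world claims.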
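Statistics
A total of 21 teams took part in the AVeriTeC shared task.
The winning team, TUDA_MAI, achieved an AVeriTeC score of 63%.
The baseline system achieved an AVeriTeC score of 11%.
The new test set contains 1,215 claims.
The knowledge store contains an average of 955 relevant documents per claim.
Each document contains 6,095 words on average.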
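To give a sense of scale, the figures above can be combined with some simple arithmetic. The following is a minimal sketch, not part of the shared task itself: the function and constant names are hypothetical, and multiplying the two averages only gives a rough estimate of the text available per claim, since it assumes document count and document length are roughly independent.

```python
# Figures quoted in the statistics above.
DOCS_PER_CLAIM = 955    # average documents per claim in the knowledge store
WORDS_PER_DOC = 6_095   # average words per document
WINNER_SCORE = 63       # AVeriTeC score of TUDA_MAI, in percent
BASELINE_SCORE = 11     # AVeriTeC score of the baseline, in percent


def words_per_claim(docs: int, words: int) -> int:
    """Rough estimate of words a system could search per claim (assumption:
    the product of the two averages approximates the average total)."""
    return docs * words


def improvement(winner: int, baseline: int) -> tuple[int, float]:
    """Return the gain in percentage points and the winner/baseline ratio."""
    return winner - baseline, winner / baseline


approx_words = words_per_claim(DOCS_PER_CLAIM, WORDS_PER_DOC)
points, ratio = improvement(WINNER_SCORE, BASELINE_SCORE)

print(f"Approximate words per claim: {approx_words:,}")
print(f"Gain over baseline: {points} percentage points ({ratio:.2f}x)")
```

Under these assumptions, a system has on the order of several million words of candidate evidence to sift through for each claim, which is why retrieval quality matters so much for the final score.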
Quotes
"The Automated Verification of Textual Claims (AVERITEC) shared task asks participants to retrieve evidence and predict veracity for real-world claims checked by fact-checkers."
"The shared task received 21 submissions, 18 of which surpassed our baseline. The winning team, TUDA_MAI, achieved a score of 63%, a very significant improvement on the 11% achieved by the baseline system."
"Nevertheless, there are still plenty of opportunities for further improvement."