Key Idea
LVLMs augmented with forgery-specific knowledge improve cross-domain performance in multimodal fake news detection.
Abstract
FakeNewsGPT4 proposes a novel framework that leverages the world knowledge of Large Vision-Language Models (LVLMs) and augments them with forgery-specific knowledge to address the domain shift issue in multimodal fake news detection. The framework acquires two types of forgery-specific knowledge: semantic correlation and artifact trace, and merges them into the LVLM. It also incorporates candidate answer heuristics and soft prompts to make the input more informative. Extensive experiments on the DGM4 dataset demonstrate superior cross-domain performance compared to previous methods.
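The abstract's core idea of merging forgery-specific knowledge into an LVLM can be illustrated with a minimal sketch. This is not the paper's implementation: the dimensions, the random linear projections, and the function name `inject_forgery_knowledge` are all hypothetical stand-ins for the learned adapters that would map semantic-correlation and artifact-trace features into the LVLM's token embedding space.

```python
import numpy as np

# Hypothetical dimensions -- not taken from the paper.
D_LVLM = 64            # LVLM token embedding size
D_SEM, D_ART = 32, 16  # semantic-correlation / artifact-trace feature sizes

rng = np.random.default_rng(0)

# Random projections stand in for learned adapters that map
# forgery-specific features into the LVLM embedding space.
W_sem = rng.standard_normal((D_SEM, D_LVLM)) * 0.02
W_art = rng.standard_normal((D_ART, D_LVLM)) * 0.02

def inject_forgery_knowledge(text_emb, sem_feat, art_feat):
    """Prepend projected forgery-specific features to the token sequence."""
    sem_tok = sem_feat @ W_sem  # (1, D_LVLM)
    art_tok = art_feat @ W_art  # (1, D_LVLM)
    return np.concatenate([sem_tok, art_tok, text_emb], axis=0)

text_emb = rng.standard_normal((10, D_LVLM))  # 10 caption tokens
sem_feat = rng.standard_normal((1, D_SEM))
art_feat = rng.standard_normal((1, D_ART))

fused = inject_forgery_knowledge(text_emb, sem_feat, art_feat)
print(fused.shape)  # (12, 64): two knowledge tokens + 10 original tokens
```

The point of the sketch is only the interface: the two knowledge streams enter the model as extra tokens alongside the ordinary text embeddings, so the frozen LVLM can attend to them without architectural changes.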
Statistics
The proposed FakeNewsGPT4 achieves an average AUC improvement of 25.12% by incorporating both multi-level cross-modal reasoning and dual-branch fine-grained verification modules.
Removing either the multi-level features or any one of the dual-branch features results in a significant decrease in cross-domain performance.
Utilizing both candidate answer heuristics and soft prompts significantly enhances the model's performance.
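The two input-side techniques named above can be sketched as follows. This is a hedged illustration, not the paper's code: the candidate labels, the prompt wording, and the soft-prompt dimensions are assumptions. Candidate answer heuristics constrain the answer space directly in the prompt text, while soft prompts are trainable vectors prepended to the token embeddings.

```python
import numpy as np

# Assumed binary answer space for fake news detection.
CANDIDATES = ["real", "fake"]

def build_prompt(caption):
    """Candidate answer heuristic: spell out the allowed answers in the prompt."""
    options = " or ".join(CANDIDATES)
    return f"Is this news item {options}? News: {caption}"

rng = np.random.default_rng(0)
N_SOFT, D = 4, 64  # hypothetical number of soft-prompt tokens and embedding size
soft_prompts = rng.standard_normal((N_SOFT, D)) * 0.02  # trainable in practice

def prepend_soft_prompts(token_embs):
    """Soft prompts: learnable vectors placed before the real token embeddings."""
    return np.concatenate([soft_prompts, token_embs], axis=0)

prompt = build_prompt("Officials deny the viral photo is genuine.")
token_embs = rng.standard_normal((len(prompt.split()), D))
augmented = prepend_soft_prompts(token_embs)
print(prompt)
print(augmented.shape)
```

Together these make the input more informative: the model is told what outputs are valid, and the soft prompts give it a small trainable region of the input to adapt to the detection task while the backbone stays frozen.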
Quotes
"Despite being proficient in recognizing common instances, LVLMs lack forgery-specific knowledge, compromising their effectiveness in MFND tasks."
"We pioneer leveraging world knowledge from large vision-language models (LVLMs) to tackle the domain shift issue in multimodal fake news detection."
"Our contributions are summarized as follows: We propose a generalized detector, FakeNewsGPT4."