Core concept
Large language models (LLMs) can be integrated into causal discovery workflows to extract meaningful insight from unstructured data: the LLM proposes relevant high-level factors, which are then iteratively refined using feedback from the causal discovery algorithm.
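A minimal sketch of this propose-then-refine loop is given below. The helper names (propose_factors, annotate_factors, discover_markov_blanket) are hypothetical placeholders rather than the paper's implementation or any library API; a real pipeline would back them with an LLM client and an actual causal discovery algorithm.

```python
# Sketch of the iterative loop: LLM proposes high-level factors, a causal
# discovery step filters them, and the result is fed back as context for
# the next proposal round. All helpers are illustrative placeholders.

from typing import Dict, List, Set

def propose_factors(documents: List[str], target: str, feedback: str) -> List[str]:
    """Placeholder for an LLM call that proposes candidate high-level factors."""
    return ["taste", "smell", "size"]  # dummy output for illustration

def annotate_factors(documents: List[str], factors: List[str]) -> Dict[str, List[float]]:
    """Placeholder for LLM-based annotation of each factor on each document."""
    return {f: [float(i % 3) for i in range(len(documents))] for f in factors}

def discover_markov_blanket(data: Dict[str, List[float]], target: List[float]) -> Set[str]:
    """Toy stand-in for the causal discovery step: keep factors whose values
    co-vary with the target. A real system would run an actual CD algorithm."""
    def corr(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy)
    return {name for name, values in data.items() if abs(corr(values, target)) > 0.1}

def refine_loop(documents: List[str], target_name: str,
                target_values: List[float], rounds: int = 3) -> Set[str]:
    """Iteratively ask the LLM for factors, score them with causal discovery,
    and feed the outcome back as context for the next proposal round."""
    feedback = ""
    kept: Set[str] = set()
    for _ in range(rounds):
        candidates = propose_factors(documents, target_name, feedback)
        data = annotate_factors(documents, candidates)
        kept = discover_markov_blanket(data, target_values)
        # Feedback tells the LLM which proposals survived the causal filter.
        feedback = f"kept: {sorted(kept)}; dropped: {sorted(set(candidates) - kept)}"
    return kept

if __name__ == "__main__":
    docs = [f"review {i}" for i in range(9)]
    target = [float(i % 3) for i in range(9)]
    print(refine_loop(docs, "rating", target))
```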
Statistics
Recall, precision, and F1 scores for identifying factors that form the Markov Blanket of the target variable in the AppleGastronome benchmark:
GPT-4: 80% recall, 93% precision, 85% F1
GPT-3.5: 73% recall, 100% precision, 84% F1
LLaMA2-70B: 60% recall, 83% precision, 69% F1
Mistral-Medium: 93% recall, 100% precision, 96% F1
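To make the metrics concrete, the snippet below shows how recall, precision, and F1 are computed for a set of proposed factors against a ground-truth Markov Blanket. The factor names and the ground-truth set are invented for illustration, not taken from the AppleGastronome benchmark itself.

```python
# Recall / precision / F1 for Markov Blanket identification.
# The proposed factors and ground-truth blanket below are hypothetical.

def prf1(proposed: set, ground_truth: set) -> tuple:
    tp = len(proposed & ground_truth)  # correctly proposed blanket factors
    precision = tp / len(proposed) if proposed else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: 3 of the 4 proposed factors are in a 4-factor ground-truth blanket.
proposed = {"sweetness", "juiciness", "size", "color"}
truth = {"sweetness", "juiciness", "size", "freshness"}
p, r, f = prf1(proposed, truth)
print(f"precision={p:.0%} recall={r:.0%} f1={f:.0%}")  # precision=75% recall=75% f1=75%
```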
Quotes
"The lack of high-quality high-level variables has been a longstanding impediment to broader real-world applications of CDs or causality-inspired methods."
"Trained from massive observations of the world, LLMs demonstrate impressive capabilities in comprehending unstructured inputs, and leveraging the learned rich knowledge to resolve a variety of general tasks."
"To the best of our knowledge, we are the first to leverage LLMs to propose high-level variables, thereby extending the scope of CDs to unstructured data."