Core Concepts
Large foundation models have substantially advanced automatic chart understanding, improving performance across chart-related tasks.
Abstract
Data visualization through charts plays a crucial role in conveying insights and aiding decision-making. Large foundation models such as GPT have driven recent advances in automatic chart understanding and improved performance across a range of tasks. This survey reviews recent developments in chart understanding in the context of these foundation models. It covers tasks such as chart question answering, captioning, conversion, fact-checking, and error correction, and discusses evaluation metrics, modeling strategies, open challenges such as domain-specific charts, and directions for future research. It also explores related tasks in natural image understanding and document comprehension.
Stats
"Large vision-language foundation models (e.g., GPT-4V [16], LLaVA [17]) have catalyzed unprecedented advancements across various multimedia cognitive tasks."
"The dataset PlotQA [4] includes open-vocabulary questions that require applying aggregation operations on underlying chart data."
"ChartFC [13] and ChartCheck [12] are datasets specifically designed for chart fact-checking."
"RNSS represents each entry of the predicted table using values only to calculate similarity between predicted and ground truth tables."
"CHARTVE formulates factual inconsistency detection as a visual entailment task to predict consistency between charts and captions."
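The RNSS description quoted above can be made concrete with a small sketch. It assumes one commonly cited formulation of the metric, which this summary does not spell out: every number in the predicted and ground-truth tables is extracted into a flat list, each pair is scored by a relative distance capped at 1, a minimal-cost one-to-one matching is found, and the total cost is normalized by the size of the larger list. The function names, the zero-target handling, and the brute-force matching (only practical for very small tables) are illustrative choices, not the survey's implementation.

```python
from itertools import permutations


def relative_distance(p: float, t: float) -> float:
    """Relative distance between a predicted and a target number, capped at 1."""
    if t == 0:
        # Assumption: with a zero target, only an exact match has distance 0.
        return 0.0 if p == 0 else 1.0
    return min(1.0, abs(p - t) / abs(t))


def rnss(predicted: list[float], target: list[float]) -> float:
    """Score in [0, 1]; 1.0 means every matched pair of numbers is identical."""
    if not predicted and not target:
        return 1.0
    if not predicted or not target:
        return 0.0

    # Match each element of the shorter list to a distinct element of the longer one.
    pred_is_short = len(predicted) <= len(target)
    short, long_ = (predicted, target) if pred_is_short else (target, predicted)

    def cost(s: float, l: float) -> float:
        # Always call relative_distance(prediction, target), whichever list is shorter.
        return relative_distance(s, l) if pred_is_short else relative_distance(l, s)

    # Brute-force search over assignments; fine for a handful of numbers only.
    best = min(
        sum(cost(s, l) for s, l in zip(short, perm))
        for perm in permutations(long_, len(short))
    )
    # Under this assumed formulation, unmatched numbers add no cost of their own;
    # they only enlarge the normalizer.
    return 1.0 - best / len(long_)
```

For example, `rnss([10.0, 20.0], [10.0, 25.0])` matches 10 with 10 and 20 with 25, so the only cost is the relative error 5/25 averaged over two entries. Because only values are compared, a table with correct numbers placed under the wrong row or column headers would still score well under this sketch.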
Quotes
"Charts serve as indispensable tools for translating raw data into comprehensible visual narratives."
"Foundation models like generative pre-trained transformers (GPT) have revolutionized various natural language processing (NLP) tasks."
"Automated chart understanding sits at the intersection of opportunity and impact."
"The lack of domain-specific chart understanding datasets indicates great opportunities for future work."
"Pre-training enables models to learn more robust feature representations for accurate interpretation of charts."