UniTable: A Unified Framework for Table Structure Recognition via Self-Supervised Pretraining
Core Concepts
UniTable unifies both the training paradigm and the training objectives to improve table structure recognition.
Abstract
UniTable is a framework that combines pixel-level inputs with self-supervised pretraining to improve table structure recognition (TSR). It unifies the three training objectives (extracting table structure, cell content, and cell bounding boxes) into a single language modeling task, and achieves state-of-the-art performance on four of the five largest TSR datasets. The framework is open source, promoting reproducible research and transparency in the field. Extensive analyses demonstrate its effectiveness across the different TSR tasks.
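To make the idea of a unified language modeling objective concrete, the following is a minimal sketch, not UniTable's actual implementation. It assumes that table structure is expressed as HTML-like tokens and that bounding-box coordinates are quantized into discrete bins; the names `bbox_to_tokens`, `next_token_pairs`, the `bbox-N` token format, and the bin count of 448 are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

NUM_BINS = 448  # assumed number of coordinate bins; illustrative only


def bbox_to_tokens(
    bbox: Tuple[float, float, float, float],
    image_size: Tuple[int, int],
    num_bins: int = NUM_BINS,
) -> List[str]:
    """Quantize a pixel box (xmin, ymin, xmax, ymax) into four discrete tokens."""
    width, height = image_size
    xmin, ymin, xmax, ymax = bbox
    scaled = (xmin / width, ymin / height, xmax / width, ymax / height)
    # Clamp so that a coordinate on the far edge still maps to the last bin.
    return [f"bbox-{min(num_bins - 1, int(v * num_bins))}" for v in scaled]


def next_token_pairs(tokens: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Turn any token sequence into (prefix, next token) training pairs."""
    seq = ["<bos>", *tokens, "<eos>"]
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


# Three different targets, all reduced to the same kind of token sequence.
structure = ["<tr>", "<td>", "</td>", "</tr>"]         # table structure
content = list("42")                                   # cell content, by character
bbox = bbox_to_tokens((0, 0, 224, 112), (448, 448))    # cell bounding box

for name, target in [("structure", structure), ("content", content), ("bbox", bbox)]:
    print(name, len(next_token_pairs(target)), "training pairs")
```

Because every target becomes a token sequence, the same next-token prediction loss can in principle be applied to all three tasks, which is the sense in which the objectives are unified.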
Stats
UniTable achieves a state-of-the-art (SOTA) accuracy of 99.18% on SynthTabNet when pretrained on 2M images.
Self-supervised pretraining (SSP) significantly improves model performance by offsetting the drop caused by replacing the CNN backbone with a linear projection.
UniTable outperforms previous methods on four of the five largest table structure recognition (TSR) datasets.
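As an illustration of what replacing a CNN backbone with a linear projection can look like, here is a minimal pure-Python sketch. It assumes a grayscale image stored as a list of rows, non-overlapping square patches, and a hand-written weight matrix; the helper names `patchify` and `linear_projection` and all sizes are hypothetical and are not taken from the UniTable code.

```python
from typing import List

Matrix = List[List[float]]


def patchify(image: Matrix, patch: int) -> Matrix:
    """Split an H x W image into flattened, non-overlapping patch vectors."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return patches


def linear_projection(patches: Matrix, weight: Matrix, bias: List[float]) -> Matrix:
    """Embed each flattened patch x as W x + b, with no convolutional layers."""
    return [
        [sum(w * x for w, x in zip(row, p)) + b for row, b in zip(weight, bias)]
        for p in patches
    ]


# A 4 x 4 toy image cut into 2 x 2 patches gives 4 patches of 4 pixels each.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = patchify(image, patch=2)

# Toy weights for a 3-dimensional embedding; in practice these are learned.
weight = [[1.0] * 4 for _ in range(3)]
bias = [0.0, 0.0, 0.0]
embeddings = linear_projection(patches, weight, bias)
print(len(embeddings), "patch embeddings of size", len(embeddings[0]))
```

Each patch is embedded independently by a single matrix multiplication, so the projection has no built-in visual priors of the kind a convolutional stack provides. This is one plausible reading of why pretraining matters more once the CNN is removed.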
Quotes
"UniTable's training paradigm combines simplicity with scalability empowered by self-supervised pretraining."
"Extensive quantitative and qualitative analyses highlight UniTable's state-of-the-art performance."
"We open-source our code to promote reproducible research and transparency in the field."