The researchers propose a pipeline that integrates three deep learning models: DETR for table detection, CascadeTabNet for table structure recognition, and PP-OCRv2 for text detection and recognition. The integrated approach is designed to handle the diverse table styles, complex structures, and image distortions commonly encountered in document images.
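The three-stage flow described above can be sketched as follows. This is only an illustrative skeleton, not the authors' implementation: `run_pipeline`, `Cell`, and the `fake_*` stage functions are hypothetical names introduced here, and the stubs stand in for the real models (DETR, CascadeTabNet, PP-OCRv2), whose actual APIs are not shown in the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Bounding box as (x_min, y_min, x_max, y_max) in pixels (an assumed convention).
Box = Tuple[int, int, int, int]


@dataclass
class Cell:
    box: Box
    text: str = ""


def run_pipeline(
    image: Any,
    detect_tables: Callable[[Any], List[Box]],
    recognize_structure: Callable[[Any, Box], List[Box]],
    read_text: Callable[[Any, Box], str],
) -> List[List[Cell]]:
    """Chain the three stages: table detection, structure recognition, OCR."""
    tables = []
    for table_box in detect_tables(image):
        # Stage 2: split each detected table region into cell boxes.
        cells = [Cell(box=b) for b in recognize_structure(image, table_box)]
        # Stage 3: read the text inside every cell.
        for cell in cells:
            cell.text = read_text(image, cell.box)
        tables.append(cells)
    return tables


# Hypothetical stand-ins for the real models, used only to make the sketch runnable.
def fake_detect_tables(image):
    return [(0, 0, 100, 40)]


def fake_recognize_structure(image, table_box):
    x0, y0, x1, y1 = table_box
    mid = (x0 + x1) // 2
    return [(x0, y0, mid, y1), (mid, y0, x1, y1)]


def fake_read_text(image, cell_box):
    return f"cell@{cell_box[0]}"


tables = run_pipeline(None, fake_detect_tables, fake_recognize_structure, fake_read_text)
print([[c.text for c in t] for t in tables])  # [['cell@0', 'cell@50']]
```

Passing the stages in as callables keeps the sketch independent of any particular model library; each stub could be swapped for a wrapper around the corresponding model.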
The key highlights of the methodology are:
The integrated pipeline outperforms existing methods such as Table Transformer. It achieves an IoU of 0.96 and an OCR accuracy of 78%, an improvement of approximately 25% in OCR accuracy.
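For readers unfamiliar with the IoU (intersection over union) figure quoted above, a minimal sketch of the standard computation for two axis-aligned boxes is given here. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box convention are assumptions made for illustration; the summary does not state the exact evaluation protocol the authors used.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs have no meaningful overlap ratio.
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
ground_truth = (5, 0, 15, 10)
print(iou(predicted, ground_truth))  # overlap 50, union 150, so about 0.333
```

A value of 1.0 means the predicted and reference boxes coincide exactly, and 0.0 means they do not overlap at all.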
The proposed approach contributes to the advancement of image-based table recognition techniques, offering a promising solution for handling diverse table layouts in real-world scenarios and enhancing data extraction and comprehension in digitized documents.
Key insights extracted from the paper by Avinash Anan... on arxiv.org, 04-17-2024
https://arxiv.org/pdf/2404.10305.pdf