IndicSTR12: A Comprehensive Dataset for Indic Scene Text Recognition
Key Concepts
IndicSTR12 aims to address the lack of comprehensive datasets for Indian languages by proposing a real dataset and benchmarking STR performance on 12 major Indian languages.
Summary
Introduction:
Importance of Scene Text Recognition (STR) in the digital world.
Data-intensive deep learning approaches drive STR solutions.
Dataset Creation:
IndicSTR12 proposed as the largest and most comprehensive real dataset for 12 major Indian languages.
Dataset includes over 27,000 word-images from natural scenes with diverse conditions.
Models Used:
Benchmarking performed on PARSeq, CRNN, and STARNet models.
Experiments:
Models trained on synthetic data and tested on the IndicSTR12 dataset.
Multi-lingual training demonstrated improved performance for individual languages.
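As an illustration of how a train-on-synthetic, test-on-real evaluation like the one above might be scored, here is a minimal sketch. The summary does not say which metrics were used, so the two measures below (exact-match word accuracy and an edit-distance-based character accuracy) are assumptions, and the function names `edit_distance` and `score` are hypothetical. Strings are NFC-normalized first because the same Indic text can be encoded with different code point sequences.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score(predictions, labels):
    """Return (word_accuracy, char_accuracy) for paired predictions and labels.

    Assumed metrics, not taken from the source: word accuracy is the share of
    exact matches; character accuracy is 1 - (total edits / total label length).
    """
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    exact, total_chars, total_errors = 0, 0, 0
    for pred, gold in zip(predictions, labels):
        pred = unicodedata.normalize("NFC", pred)
        gold = unicodedata.normalize("NFC", gold)
        exact += int(pred == gold)
        total_chars += len(gold)
        total_errors += edit_distance(pred, gold)
    word_acc = exact / len(labels)
    char_acc = max(0.0, 1.0 - total_errors / total_chars) if total_chars else 0.0
    return word_acc, char_acc
```

Calling `score(model_outputs, ground_truth)` once per language would give per-language figures that could be compared between single-language and multi-lingual training runs. Note that code points are only a rough proxy for characters in Indic scripts, where one visible glyph often spans several code points.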
Conclusion:
IndicSTR12 provides a valuable resource for developing robust text detection and recognition models in Indian languages.
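Latin STR models are typically applied to Latin-script languages such as English, where they achieve high accuracy. Matching that performance on non-Latin scripts and structurally complex languages, such as those spoken in India, is difficult. The scripts and grammatical structures used in India differ from those of Latin-script languages, so dedicated models and training approaches adapted to them are needed.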