Key Concepts
IndicSTR12 aims to address the lack of comprehensive datasets for Indian languages by proposing a real dataset and benchmarking STR performance on 12 major Indian languages.
Summary
Introduction:
Importance of Scene Text Recognition (STR) in the digital world.
Data-intensive deep learning approaches drive STR solutions.
Dataset Creation:
IndicSTR12 proposed as the largest and most comprehensive real dataset for 12 major Indian languages.
Dataset includes over 27,000 word-images from natural scenes with diverse conditions.
Models Used:
Benchmarking performed on PARSeq, CRNN, and STARNet models.
Experiments:
Models trained on synthetic data and tested on the IndicSTR12 dataset.
Multi-lingual training demonstrated improved performance for individual languages.
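The train-on-synthetic, test-on-real protocol above ends in a scoring step that compares model predictions with ground-truth labels. The summary does not say which metrics were reported, so the sketch below assumes two that are common in STR work: exact-match word accuracy and an edit-distance-based character accuracy. The names `edit_distance` and `score` are hypothetical helpers, not part of any benchmark code.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (word_accuracy, char_accuracy) for (prediction, ground_truth) pairs.

    Assumed definitions (not taken from the source):
      word_accuracy = fraction of predictions that match the label exactly
      char_accuracy = 1 - total edit distance / total label length, floored at 0
    """
    if not pairs:
        raise ValueError("need at least one (prediction, ground_truth) pair")
    exact = sum(1 for pred, gt in pairs if pred == gt)
    total_chars = sum(len(gt) for _, gt in pairs)
    total_errors = sum(edit_distance(pred, gt) for pred, gt in pairs)
    word_acc = exact / len(pairs)
    char_acc = max(0.0, 1.0 - total_errors / total_chars) if total_chars else 0.0
    return word_acc, char_acc


if __name__ == "__main__":
    # Toy predictions; a real run would pair model outputs with dataset labels.
    word_acc, char_acc = score([("hello", "hello"), ("wrld", "world")])
    print(f"word accuracy: {word_acc:.2f}, character accuracy: {char_acc:.2f}")
```

Note that this counts characters as code points. For Indic scripts, where one visible character may span several code points (consonant, virama, vowel sign), a grapheme-cluster-based count may be more appropriate; that choice is left open here.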
Conclusion:
IndicSTR12 provides a valuable resource for developing robust text detection and recognition models in Indian languages.
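Statistics
Little work has been done on complex Indian languages, which are spoken and read by 1.3 billion people.
The dataset contains over 27,000 word-images collected from a variety of natural scenes.
Alongside the new dataset, high-performing baselines are provided for three models: PARSeq (the Latin SOTA), CRNN, and STARNet.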