Key Concepts
LeBenchmark 2.0 introduces a standardized framework for evaluating and building French speech technologies based on self-supervised learning (SSL), and offers new perspectives on pre-trained models and evaluation protocols.
Summary
Contents:
Introduction
Data Extraction for SSL Models in French Speech Processing
Gathering Datasets for Pre-training SSL Models
Building Pre-trained French SSL Models Collection with Wav2vec 2.0
Benchmarking French Tasks for SSL Models
1. Introduction:
Self-supervised learning (SSL) has revolutionized various domains, including speech processing.
LeBenchmark 2.0 aims to provide a standardized framework for evaluating and developing SSL representations of French speech.
2. Data Extraction for SSL Models in French Speech Processing:
Large-scale corpora of up to 14,000 hours are used to pre-train the models.
Three new pre-trained models are introduced, ranging from 26 million to one billion parameters.
3. Gathering Datasets for Pre-training SSL Models:
Various datasets collected, covering diverse accents, emotions, dialogues, and speech types.
Audiocite.net dataset added with over 6,600 hours of read speech.
4. Building Pre-trained French SSL Models Collection with Wav2vec 2.0:
Introduction of three new pre-trained models based on the Extra Large dataset.
Detailed hardware and software environments used for large-scale pre-training.
5. Benchmarking French Tasks for SSL Models:
Evaluation of different scenarios with challenging low-resource and high-resource datasets.
Comparison of LeBenchmark models with XLS-R-xlarge baseline in ASR tasks.
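ASR comparisons of this kind are commonly reported as word error rate (WER). The summary does not name the metric used, so the following is only a minimal sketch under that assumption: it computes WER as the word-level edit distance between a reference transcript and a model hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and do not come from LeBenchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper only; not taken from the LeBenchmark code base.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts for two systems on the same utterance.
    reference = "le chat dort sur le canapé"
    system_a = "le chat dort sur le canapé"
    system_b = "le chien dort sur canapé"
    print("system A WER:", word_error_rate(reference, system_a))
    print("system B WER:", word_error_rate(reference, system_b))
```

A lower value means fewer word-level errors. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.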
Statistics
The LeBenchmark models were trained on up to 14,000 hours of data.
The new pre-trained models contain between 26 million and one billion learnable parameters.