Small Language Models Can Learn Linguistic Representations from Character-Level Inputs
Small language models trained on character-level inputs can capture linguistic structure at multiple levels, including phonetics, lexicon, and syntax, matching or even surpassing larger subword-based models.