The choice of tokenizer significantly impacts the downstream performance and training costs of Large Language Models (LLMs): it determines how many tokens a given corpus is split into, and therefore how much compute each training pass consumes, as well as how faithfully text from different domains and languages is represented.
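As a concrete illustration of the cost effect, the minimal sketch below (not from this work; it assumes the Hugging Face `transformers` package, network access to fetch tokenizer files, and uses `gpt2` and `bert-base-uncased` purely as arbitrary examples) counts how many tokens two publicly available tokenizers produce for the same string. Since transformer compute scales with sequence length, a tokenizer that inflates the token count of a corpus directly raises the cost of training on it.

```python
# Minimal sketch: compare how many tokens two off-the-shelf tokenizers
# produce for identical text. The model names below are illustrative
# choices, not tokenizers discussed in this paper.
from transformers import AutoTokenizer

text = (
    "Tokenizer choice changes how many tokens the same text occupies, "
    "especially for non-English or code-like input: 自然言語処理, def foo(x): return x**2"
)

for name in ("gpt2", "bert-base-uncased"):
    tok = AutoTokenizer.from_pretrained(name)
    # Encode without special tokens so the counts reflect the text alone.
    ids = tok.encode(text, add_special_tokens=False)
    print(f"{name:20s} {len(ids):3d} tokens  first pieces: {tok.convert_ids_to_tokens(ids[:6])}")
```

Running a comparison like this on a representative sample of the training corpus gives a quick estimate of per-tokenizer sequence-length overhead before committing to a full training run.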