Key Concepts
RakutenAI-7B introduces advanced Japanese-oriented large language models, achieving top performance in language understanding benchmarks.
Statistics
By improving the tokenization process for Japanese, we can achieve cost-efficient text processing during both model training and inference.
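The cost argument can be made concrete with a characters-per-token metric: for the same text, a tokenizer that emits fewer tokens means fewer positions to process during training and inference. A minimal sketch of the metric, where the sample text and token counts are illustrative assumptions, not figures from the report:

```python
# Characters-per-token as a rough tokenizer cost metric: higher is cheaper,
# because the same text is covered by fewer tokens.
# The token counts below are hypothetical, not measured from any real tokenizer.

def chars_per_token(text: str, num_tokens: int) -> float:
    """Average number of characters covered by one token."""
    return len(text) / num_tokens

text = "楽天グループは日本の企業です。"  # 15 characters

# Hypothetical counts: a tokenizer falling back to bytes may need several
# tokens per Japanese character, while an extended Japanese vocabulary
# covers the same text with far fewer tokens.
baseline_tokens = 45
extended_tokens = 12

print(f"baseline: {chars_per_token(text, baseline_tokens):.2f} chars/token")
print(f"extended: {chars_per_token(text, extended_tokens):.2f} chars/token")
```

Under these assumed counts, the extended vocabulary packs several times more characters into each token, which translates directly into lower compute cost per character of Japanese text.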
We train the released models on approximately 175 billion tokens of filtered data.
Our instruct model achieves an average score of 68.74, leading by almost 2 points over Youri-7B-instruction, the second-best model.
Our model achieves an average score of 60.50, with Japanese-StableLM-Base-Gamma-7b trailing by more than 4 points.
Our instruct model achieves an average score of 61.32, leading by almost 5 points over Youri-7B-instruction, the second-best model.
Quotes
"We release our models to the public under the Apache 2.0 License."
"Our aim is to help the community create more affordable and efficient Japanese language models."