RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
RecurrentGemma is an open language model that matches the performance of the Gemma-2B transformer while delivering significantly faster inference, particularly on long sequences. It achieves this by replacing global attention with a combination of linear recurrences and local attention, so its state has a fixed size regardless of sequence length.
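To make the fixed-size-state idea concrete, here is a minimal, hypothetical sketch of a diagonal linear recurrence in plain Python. It is illustrative only and is not the actual RecurrentGemma block; all names and parameter values are assumptions. The key property it demonstrates is that the recurrent state stays the same size at every step, unlike a transformer's KV cache, which grows with the sequence.

```python
def linear_recurrence(xs, a, b):
    """Toy diagonal linear recurrence: h_t = a * h_{t-1} + b * x_t (elementwise).

    Illustrative sketch only, not the RecurrentGemma architecture.
    The state h has a fixed size len(a), independent of sequence length.
    """
    h = [0.0] * len(a)
    outputs = []
    for x_t in xs:
        # Elementwise state update: O(d) memory and constant cost per token.
        h = [ai * hi + bi * xi for ai, hi, bi, xi in zip(a, h, b, x_t)]
        outputs.append(list(h))
    return outputs

d = 4                                   # hypothetical state size
a = [0.9] * d                           # per-channel decay, |a| < 1 for stability
b = [1.0] * d                           # input gain
xs = [[float(t + j) for j in range(d)] for t in range(16)]  # 16-token toy input
ys = linear_recurrence(xs, a, b)
print(len(ys), len(ys[0]))              # one output per token, state stays size 4
```

Because each step touches only a fixed-size state, per-token inference cost does not grow with context length, which is the source of the speedup on long sequences described above.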