Efficient Gated Linear Recurrent Neural Networks with Expanded State Size for Improved Language Modeling and Downstream Tasks
HGRN2 adds a simple outer-product-based state expansion mechanism to HGRN that significantly enlarges the recurrent state without adding parameters, improving performance in language modeling, image classification, and long-range tasks.
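
As a rough illustration of how an outer-product update can enlarge a recurrent state without new parameters, the sketch below contrasts an element-wise gated update (vector state of size d) with an outer-product update (matrix state of size d x d) driven by the same gate and input vectors. The exact recurrence, the function names, and the read-out through a query vector `q` are assumptions made for illustration; they are not specified in the summary above.

```python
import math


def sigmoid(x):
    """Squash a real number into (0, 1) so it can act as a forget gate."""
    return 1.0 / (1.0 + math.exp(-x))


def elementwise_step(h, f, i):
    """Assumed baseline update: the state is a vector with d entries.

    Each entry keeps a fraction f[k] of its old value and writes
    (1 - f[k]) * i[k] of the new input.
    """
    return [fk * hk + (1.0 - fk) * ik for hk, fk, ik in zip(h, f, i)]


def outer_product_step(H, f, i):
    """Assumed expanded update: the state is a d x d matrix.

    Row k decays by f[k]; the new input is written as the outer product
    of (1 - f) and i. Only the vectors f and i are used, so the larger
    state needs no extra parameters.
    """
    d = len(f)
    return [
        [f[k] * H[k][j] + (1.0 - f[k]) * i[j] for j in range(d)]
        for k in range(d)
    ]


def read_out(H, q):
    """Contract the matrix state with a query vector q (hypothetical read-out)."""
    d = len(q)
    return [sum(q[k] * H[k][j] for k in range(d)) for j in range(d)]


d = 3
f = [sigmoid(x) for x in (0.5, -1.0, 2.0)]  # forget gate values in (0, 1)
i = [1.0, 2.0, 3.0]                         # input vector

h = elementwise_step([0.0] * d, f, i)
H = outer_product_step([[0.0] * d for _ in range(d)], f, i)

print(len(h))                      # 3 state entries
print(sum(len(row) for row in H))  # 9 state entries
```

With the same `f` and `i`, the state grows from d to d x d entries, which is the sense in which the expansion is parameter-free in this sketch.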