State-space models offer a low-complexity alternative to transformers for encoding long sequences, enabling efficient handling of significantly longer inputs. LOCOST demonstrates competitive performance while being more memory-efficient than state-of-the-art sparse transformers.
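To make the complexity claim concrete, the following is the standard linear state-space recurrence that models in this family build on; this is a generic sketch of the technique, not LOCOST's exact parameterization:

$$
x_t = A\,x_{t-1} + B\,u_t, \qquad y_t = C\,x_t
$$

Unrolling the recurrence over an input sequence $u_1, \dots, u_L$ expresses the output as a convolution $y = \bar{K} * u$ with kernel $\bar{K} = (CB,\ CAB,\ CA^2B,\ \dots)$, which can be evaluated with FFTs in $O(L \log L)$ time, in contrast to the $O(L^2)$ time and memory of full self-attention over a length-$L$ sequence.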