Deep neural networks that interleave linear complex-valued RNN layers with position-wise MLPs have proven effective for sequence modeling. The paper shows that this combination can approximate regular causal sequence-to-sequence maps to arbitrary precision: the linear recurrence compresses the input history into the hidden state, and the MLP decodes it. It examines why complex numbers in the recurrence are beneficial and when the input can be reconstructed from the hidden state; even small hidden dimensions admit accurate reconstruction, especially when paired with non-linear decoders.
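To make the architecture concrete, here is a minimal NumPy sketch of one such block, assuming a diagonal complex-valued recurrence h_t = λ ⊙ h_{t-1} + B x_t followed by a two-layer position-wise MLP applied to the stacked real and imaginary parts of the state. The function names, layer sizes, and ring-initialization radii are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_diag_recurrence(n, r_min=0.9, r_max=0.999):
    """Complex eigenvalues sampled on a ring close to the unit circle
    (stable but slowly decaying memory); the radii are illustrative."""
    radius = rng.uniform(r_min, r_max, n)
    phase = rng.uniform(0, 2 * np.pi, n)
    return radius * np.exp(1j * phase)          # shape (n,)

def linear_rnn(x, lam, B):
    """Run the linear recurrence h_t = lam * h_{t-1} + B x_t.
    x: (T, d) real inputs; lam: (n,) complex; B: (n, d) complex."""
    h = np.zeros(lam.shape[0], dtype=np.complex128)
    hs = np.empty((x.shape[0], lam.shape[0]), dtype=np.complex128)
    for t in range(x.shape[0]):
        h = lam * h + B @ x[t]
        hs[t] = h
    return hs

def position_wise_mlp(hs, W1, W2):
    """Apply the same two-layer ReLU MLP independently at each time step,
    after stacking real and imaginary parts into a real feature vector."""
    feats = np.concatenate([hs.real, hs.imag], axis=-1)  # (T, 2n)
    return np.maximum(feats @ W1, 0.0) @ W2

# One block: linear complex recurrence followed by a position-wise MLP.
T, d, n, m = 64, 4, 32, 16
x = rng.standard_normal((T, d))
lam = init_diag_recurrence(n)
B = (rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))) / np.sqrt(2 * d)
W1 = rng.standard_normal((2 * n, 64)) / np.sqrt(2 * n)
W2 = rng.standard_normal((64, m)) / np.sqrt(64)
y = position_wise_mlp(linear_rnn(x, lam, B), W1, W2)     # (T, m) outputs
```

Stacking several of these blocks, with each MLP output feeding the next recurrence, gives the kind of deep architecture the paper studies.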
On the theoretical side, the expressivity results show that a linear RNN can compress the input sequence into its hidden state in a way that permits faithful reconstruction. The paper also discusses how initialization of the recurrence (eigenvalues close to the unit circle) and the use of complex numbers enhance the model's memory. Experiments support the theoretical findings, demonstrating strong performance when learning non-linear sequence-to-sequence mappings.
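As a hypothetical illustration of the compression-and-reconstruction claim, the sketch below builds the linear map from a length-T input to the final hidden state of the diagonal recurrence above and probes how well a linear least-squares decoder recovers the input for different hidden sizes n. The ring initialization near the unit circle and all sizes are assumptions for the demo; the paper's trained non-linear (MLP) decoders go beyond this simple linear probe.

```python
import numpy as np

rng = np.random.default_rng(1)

def final_state_map(lam, B, T):
    """Matrix M with h_T = M @ x_flat for the linear recurrence
    h_t = lam * h_{t-1} + B x_t (inputs flattened time-major)."""
    n, d = B.shape
    M = np.empty((n, T * d), dtype=np.complex128)
    for t in range(T):
        # Contribution of x_t to h_T is lam^(T-1-t) * B.
        M[:, t * d:(t + 1) * d] = (lam ** (T - 1 - t))[:, None] * B
    return M

T, d = 64, 1
for n in (16, 32, 64):  # hidden sizes to compare
    radius = rng.uniform(0.9, 0.999, n)
    lam = radius * np.exp(1j * rng.uniform(0, 2 * np.pi, n))
    B = rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))
    M = final_state_map(lam, B, T)
    x = rng.standard_normal(T * d)
    h_T = M @ x
    # Linear reconstruction: least squares on real/imag parts of h_T;
    # a complex state of size n carries 2n real dimensions.
    M_real = np.concatenate([M.real, M.imag], axis=0)
    target = np.concatenate([h_T.real, h_T.imag])
    x_hat, *_ = np.linalg.lstsq(M_real, target, rcond=None)
    err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
    print(f"n={n}: relative reconstruction error {err:.3e}")
```

Once 2n reaches the input length, reconstruction becomes essentially exact, matching the intuition that the linear recurrence acts as a compressor of the input history.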
Key insights distilled from the source by Antonio Orvi... at arxiv.org, 03-12-2024.
https://arxiv.org/pdf/2307.11888.pdf