Deep neural networks built from linear complex-valued RNNs interleaved with position-wise MLPs are proving effective for sequence modeling: the combination can approximate regular causal sequence-to-sequence maps to arbitrary precision. The study examines why complex numbers in the recurrence help and when the input sequence can be reconstructed from the hidden states, showing that even small hidden dimensions achieve accurate reconstruction, especially when paired with non-linear decoders.
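To make the architecture concrete, here is a minimal NumPy sketch of one such block: a linear diagonal complex-valued recurrence followed by a position-wise MLP applied independently at each time step. The function names, shapes, and the diagonal parameterization are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def linear_complex_rnn(x, lam, B):
    """Linear diagonal recurrence h_t = lam * h_{t-1} + B @ x_t.

    x:   (T, d_in) real input sequence
    lam: (n,) complex eigenvalues of the diagonal transition matrix
    B:   (n, d_in) complex input projection
    Returns the full (T, n) trajectory of complex hidden states.
    """
    T, n = x.shape[0], lam.shape[0]
    h = np.zeros(n, dtype=np.complex128)
    states = np.empty((T, n), dtype=np.complex128)
    for t in range(T):
        h = lam * h + B @ x[t]          # purely linear: no activation
        states[t] = h
    return states

def position_wise_mlp(states, W1, b1, W2, b2):
    """Non-linear decoder applied to each time step independently.

    Real and imaginary parts are stacked so the MLP sees real vectors.
    """
    z = np.concatenate([states.real, states.imag], axis=-1)  # (T, 2n)
    return np.maximum(z @ W1 + b1, 0.0) @ W2 + b2            # ReLU MLP
```

Stacking several such blocks yields the deep networks discussed above; the key structural point is that all mixing across time is linear, while all non-linearity is position-wise.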
The paper presents theoretical expressivity results showing how a linear RNN can compress its input into the hidden state in a way that permits faithful reconstruction. It also discusses how the initialization of the recurrence affects this compression and how complex eigenvalues enhance memory. Practical experiments validate the theoretical findings, demonstrating strong performance when learning non-linear sequence-to-sequence mappings.
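The reconstruction claim can be checked directly in a few lines. The sketch below makes assumptions of its own (scalar inputs, eigenvalues sampled on a ring close to the unit circle, hidden dimension larger than the sequence length): it unrolls the linear recurrence into a matrix equation h_T = M x and recovers the entire input sequence from the final hidden state alone via least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 32, 64                        # sequence length, hidden dimension

# Assumed ring initialization: eigenvalues r * e^{i*theta} with modulus
# near 1, so information from early time steps is barely damped.
r = rng.uniform(0.9, 0.999, size=n)
theta = rng.uniform(0.0, 2 * np.pi, size=n)
lam = r * np.exp(1j * theta)
b = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(n)

x = rng.standard_normal(T)           # scalar input sequence

# Unrolled recurrence: h_T = sum_k lam**(T-1-k) * b * x_k = M @ x
M = (lam[:, None] ** np.arange(T - 1, -1, -1)[None, :]) * b[:, None]
h_T = M @ x

# Linear reconstruction of the whole input from the final state alone;
# with n >= T the least-squares solution should be essentially exact.
x_hat = np.linalg.lstsq(M, h_T, rcond=None)[0].real
print("max reconstruction error:", np.abs(x - x_hat).max())
```

Because the eigenvalues have modulus close to 1, early inputs survive in h_T almost undamped, which is one way to see why complex eigenvalues near the unit circle aid memory: they rotate stored information rather than shrink it.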
by Antonio Orvi... at arxiv.org, 03-12-2024
https://arxiv.org/pdf/2307.11888.pdf