Core Concepts
The paper examines the practical promise of Real-Time Recurrent Learning (RTRL) for recurrent neural networks, focusing on its advantages and limitations in realistic settings.
Abstract
The paper compares RTRL with backpropagation through time (BPTT) for sequence-processing recurrent neural networks. RTRL offers conceptual advantages over BPTT: it learns online and computes untruncated gradients for sequences of arbitrary length. The study combines RTRL with policy gradients in actor-critic methods and evaluates them in environments including DMLab-30, ProcGen, and Atari 2600. By using recurrent architectures with element-wise recurrence, which make exact RTRL tractable, the approach achieves performance competitive with well-known baselines such as IMPALA and R2D2. The paper also discusses RTRL's limitations in real-world applications, in particular its complexity in the multi-layer case.
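Element-wise recurrence is what makes exact RTRL affordable here: when each hidden unit's recurrence depends only on its own previous state through a single scalar weight, the sensitivity of each unit with respect to its recurrent weight is itself a scalar, so RTRL's state shrinks to O(N). A minimal NumPy sketch of this idea, using a hypothetical toy cell h_t = tanh(W x_t + u ⊙ h_{t-1}) rather than the paper's actual architecture:

```python
import numpy as np

def rtrl_step_elementwise(x, h_prev, W, u, s_prev):
    """One RTRL step for h_t = tanh(W @ x_t + u * h_{t-1}),
    where the recurrence u is element-wise (diagonal).

    s_prev holds dh_{t-1}/du as a length-N vector (one scalar per
    unit), which is what keeps RTRL's state O(N) instead of O(N^3).
    """
    a = W @ x + u * h_prev            # pre-activation
    h = np.tanh(a)
    d = 1.0 - h * h                   # tanh'(a)
    # Recursive sensitivity: s_t = tanh'(a) * (h_{t-1} + u * s_{t-1})
    s = d * (h_prev + u * s_prev)
    return h, s

# Online training loop: no cached past activations, no truncation.
rng = np.random.default_rng(0)
N, D = 4, 3
W = rng.standard_normal((N, D)) * 0.1
u = rng.standard_normal(N) * 0.1
h = np.zeros(N)
s = np.zeros(N)                       # dh/du, carried forward online
for t in range(5):
    x = rng.standard_normal(D)
    h, s = rtrl_step_elementwise(x, h, W, u, s)
grad_u = 2 * h * s                    # e.g. gradient of loss L = sum(h**2)
```

Because unit i never reads any other unit's state, dh[j]/du[i] vanishes for j ≠ i, so the loss gradient for u is available at every step from the current h and s alone.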
Stats
RTRL requires neither caching past activations nor truncating context.
Our system trained on fewer than 1.2 B environment frames is competitive with or outperforms well-known baselines trained on 10 B frames.
Storing the sensitivity matrix requires O(N^3) space, where N is the number of hidden units.
Updating the sensitivity matrix/tensor via Eq. 4 costs O(N^4) time per step.
Quotes
"RTRL offers certain conceptual advantages over BPTT."
"Most recent research on RTRL focuses on introducing approximation methods into computation."
"The main research question revolves around the quality of proposed approximation methods."