Geometric Dynamics of Signal Propagation Predict Trainability of Transformers
The author studies how signals propagate through the layers of a transformer, identifying phase transitions in these dynamics and deriving necessary conditions for trainability. An analysis of Lyapunov exponents then allows the test loss to be predicted from the initialization hyperparameters.
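To make the notion of a Lyapunov exponent concrete, the sketch below estimates one numerically for a toy system. It is only an illustration under stated assumptions: the model is a random tanh MLP rather than a transformer, and the function name `toy_lyapunov_exponent`, the weight scale `sigma_w`, and the default depth and width are hypothetical choices, not quantities taken from the paper. The exponent is the average per-layer log growth rate of a small perturbation; a negative value means nearby inputs converge, and a positive value means they diverge.

```python
import numpy as np

def toy_lyapunov_exponent(sigma_w, depth=50, width=256, seed=0):
    """Average per-layer log growth of a perturbation in a random tanh MLP.

    Toy model (an assumption, not the paper's transformer):
        x_{l+1} = tanh(W_l x_l),  W_l ~ N(0, sigma_w^2 / width)
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)   # signal being propagated
    v = rng.standard_normal(width)   # tangent (perturbation) direction
    v /= np.linalg.norm(v)

    log_growth = 0.0
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        h = W @ x
        # Push the perturbation through the layer Jacobian diag(tanh'(h)) W.
        v = (1.0 - np.tanh(h) ** 2) * (W @ v)
        norm = np.linalg.norm(v)
        log_growth += np.log(norm)
        v /= norm                    # renormalize to avoid under/overflow
        x = np.tanh(h)
    return log_growth / depth

if __name__ == "__main__":
    for sigma_w in (0.5, 1.0, 2.0):
        print(f"sigma_w={sigma_w}: exponent ~ {toy_lyapunov_exponent(sigma_w):.3f}")
```

In this toy setting the sign of the estimate changes as `sigma_w` grows, which is the kind of initialization-dependent transition the summary refers to; the transformer analysis itself involves additional hyperparameters that this sketch does not model.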