Exploring the Richness Scale: Understanding the Lazy Kernel and Feature-Learning Regimes in Wide Neural Networks
The training behavior of wide neural networks is characterized by a single richness hyperparameter that controls the degree of feature learning, interpolating between a lazy, kernel-like regime and a rich, feature-learning regime.
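As a rough illustration of the two regimes, the sketch below trains a small two-layer tanh network at two settings of a hypothetical output scale `alpha`. It is a minimal toy under stated assumptions, not the parameterization this text defines: the source does not specify how the richness hyperparameter enters the model, so the sketch borrows a common lazy-training construction in which the centered output is multiplied by `alpha` and the learning rate is divided by `alpha**2`. In that construction a large `alpha` corresponds to low richness (lazy, kernel-like) and a small `alpha` to higher richness. The function name `relative_drift`, the data, the width, and the step count are all illustrative choices.

```python
import math
import random


def relative_drift(alpha, width=64, steps=200, base_lr=0.1, seed=0):
    """Train a toy two-layer tanh net and return how far its weights moved.

    Assumed (hypothetical) setup: f(x) = alpha / sqrt(width) * (h(x) - h0(x)),
    where h is the raw network output and h0 its value at initialization.
    The learning rate is base_lr / alpha**2 so that function-space dynamics
    stay comparable across different values of alpha.
    """
    rng = random.Random(seed)
    xs = [-1.0, -0.5, 0.5, 1.0]            # toy 1-D inputs
    ys = [math.sin(2.0 * x) for x in xs]   # toy regression targets

    w0 = [rng.gauss(0.0, 1.0) for _ in range(width)]  # first-layer weights
    a0 = [rng.gauss(0.0, 1.0) for _ in range(width)]  # readout weights
    w, a = list(w0), list(a0)

    scale = alpha / math.sqrt(width)
    lr = base_lr / alpha ** 2

    def raw(ws, as_, x):
        return sum(ak * math.tanh(wk * x) for wk, ak in zip(ws, as_))

    init_out = [raw(w0, a0, x) for x in xs]  # subtracted so that f = 0 at init

    for _ in range(steps):
        grad_w = [0.0] * width
        grad_a = [0.0] * width
        for x, y, h0 in zip(xs, ys, init_out):
            # Residual of the mean squared error loss (1 / 2n) * sum (f - y)^2.
            err = (scale * (raw(w, a, x) - h0) - y) / len(xs)
            for k in range(width):
                h = math.tanh(w[k] * x)
                grad_a[k] += err * scale * h
                grad_w[k] += err * scale * a[k] * (1.0 - h * h) * x
        w = [wk - lr * g for wk, g in zip(w, grad_w)]
        a = [ak - lr * g for ak, g in zip(a, grad_a)]

    moved = sum((p - q) ** 2 for p, q in zip(w + a, w0 + a0))
    start = sum(q ** 2 for q in w0 + a0)
    return math.sqrt(moved / start)


drift_lazy = relative_drift(alpha=100.0)  # large alpha: weights barely move
drift_rich = relative_drift(alpha=1.0)    # smaller alpha: weights move more
print(f"relative weight change, alpha=100: {drift_lazy:.2e}")
print(f"relative weight change, alpha=1:   {drift_rich:.2e}")
```

The relative weight change serves here as a crude proxy for feature learning: when the weights stay close to their initial values, the network behaves approximately like its linearization at initialization, which is the kernel-like regime. Other diagnostics, such as the change in the hidden representations or in the tangent kernel itself, would be reasonable alternatives.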