The authors investigate what Artificial Neural Networks (ANNs) actually learn when trained on a set of items: the individual training items themselves, or the relations between them. They consider a simple auto-associative task with a small three-neuron network and analyze both analytical and numerical solutions.
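To make the setup concrete, here is a minimal numerical sketch of such an auto-associative task: a three-neuron linear network is trained by gradient descent to reproduce each training item. The items, initialization, and learning rate are illustrative choices for the demo, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative training items (columns); the paper's actual items may differ.
X = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]]).T

# Small random 3x3 weight matrix for the three-neuron network.
W = rng.normal(scale=0.1, size=(3, 3))

# Auto-association: train y = W x to reproduce each item,
# i.e. minimize ||W X - X||^2 over the training set.
lr = 0.1
for _ in range(2000):
    grad = (W @ X - X) @ X.T   # gradient of the squared error (up to a factor of 2)
    W -= lr * grad

print(np.round(W @ X - X, 6))  # ~0: every training item is mapped onto itself
```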
Key insights:
The structure of the auto-associative network reflects the symmetry group of the training set, representing the relations between the items.
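A quick way to see this symmetry claim in the linear case: if the training set is closed under a permutation (here, cyclic shifts), the learned weight matrix commutes with that permutation. The sketch below uses the analytical least-squares solution and illustrative items; it demonstrates the idea, not the authors' code.

```python
import numpy as np

# A training set closed under cyclic shifts: rolling any item gives another item.
x0 = np.array([1.0, 0.0, -1.0])
X = np.stack([np.roll(x0, k) for k in range(3)], axis=1)

# Analytical least-squares auto-associator: the projector onto the span of the items.
W = X @ np.linalg.pinv(X)

# Permutation matrix implementing the cyclic shift.
R = np.roll(np.eye(3), 1, axis=0)

print(np.allclose(W @ R, R @ W))  # True: W inherits the symmetry of the training set
```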
Linear auto-associative networks learn the relations between the training items and can generalize to reproduce items outside the training set that are consistent with the learned symmetry. They implement a stable plane attractor in the network dynamics.
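For the linear case, the plane-attractor claim can be checked directly. With the analytical solution W = X X⁺ (the orthogonal projector onto the span of the items; same illustrative items as above), every linear combination of the items is a fixed point, and components off that plane vanish under iteration:

```python
import numpy as np

X = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]]).T          # illustrative training items (columns)

W = X @ np.linalg.pinv(X)                   # projector onto the item subspace

# A never-seen mixture of the items is reproduced exactly:
# symmetry-consistent generalization beyond the training set.
v = 0.3 * X[:, 0] - 1.7 * X[:, 1]
print(np.allclose(W @ v, v))                # True: v lies on the plane attractor

# A component orthogonal to the plane decays under the dynamics x <- W x.
x = v + np.array([1.0, 1.0, 1.0])           # [1, 1, 1] is orthogonal to the plane
for _ in range(5):
    x = W @ x
print(np.round(x - v, 6))                   # ~0: the state has settled onto the plane
```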
Non-linear auto-associative networks, on the other hand, tend to learn the individual training items as stable fixed points. Their generalization ability is more limited, though networks with activation functions containing a linear regime (e.g., tanh) can still partially generalize.
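The contrast with the non-linear case can be sketched the same way. Below, a tanh network is trained on the same kind of illustrative items (scaled into tanh's output range); after training, iterating from a slightly perturbed item probes whether the item acts as a point attractor, while a mixture of items is visibly not reproduced. Items, learning rate, and iteration counts are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative items scaled to lie inside tanh's output range (-1, 1).
X = 0.8 * np.array([[1.0, 0.0, -1.0],
                    [0.0, 1.0, -1.0]]).T

# Train the non-linear auto-associator x -> tanh(W x) to reproduce the items.
W = rng.normal(scale=0.1, size=(3, 3))
lr = 0.1
for _ in range(5000):
    Y = np.tanh(W @ X)
    grad = ((Y - X) * (1.0 - Y**2)) @ X.T   # gradient of ||tanh(WX) - X||^2 (up to 2x)
    W -= lr * grad

# Iterate the map from a perturbed item: the printed distance is small if the
# item acts as a stable fixed point, as the paper reports for non-linear nets.
x = X[:, 0] + 0.05 * rng.normal(size=3)
for _ in range(50):
    x = np.tanh(W @ x)
print(np.round(x - X[:, 0], 3))

# A mixture of two items is not a fixed point: generalization to
# symmetry-consistent combinations is lost.
v = 0.5 * (X[:, 0] + X[:, 1])
print(np.round(np.tanh(W @ v) - v, 3))      # nonzero: the mixture is not reproduced
```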
The authors suggest that improving the generalization ability of ANNs requires generating a sufficiently rich repertoire of elementary operations to represent the relations in the training set, rather than just learning the individual items.
Overall, this work provides insights into the fundamental differences between how linear and non-linear ANNs represent and generalize from training data, with implications for the design of more flexible and generalizable neural network architectures.
Key Insights Distilled From
by Renate Kraus... at arxiv.org, 04-22-2024
https://arxiv.org/pdf/2404.12401.pdf