GvT: A Graph-based Vision Transformer with Talking-Heads Attention for Small Dataset Training
The proposed Graph-based Vision Transformer (GvT) uses graph convolutional projection and talking-heads attention to train effectively on small datasets, outperforming convolutional neural networks and other vision transformer variants.
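
For readers unfamiliar with talking-heads attention, the following is a minimal NumPy sketch of the general technique, in which attention logits and attention weights are mixed across heads by learned linear maps before and after the softmax. It is not GvT's implementation: the function name `talking_heads_attention`, the mixing matrices `w_pre` and `w_post`, and the tensor shapes are assumptions chosen for illustration, and the graph convolutional projection is not shown because the summary does not describe it.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def talking_heads_attention(q, k, v, w_pre, w_post):
    """Illustrative talking-heads attention (hypothetical helper).

    q, k, v:       arrays of shape (heads, tokens, dim)
    w_pre, w_post: head-mixing matrices of shape (heads, heads);
                   square here for simplicity (an assumption).
    """
    d = q.shape[-1]
    # Scaled dot-product logits, computed independently per head.
    logits = np.einsum("hqd,hkd->hqk", q, k) / np.sqrt(d)
    # Mix logits across heads before the softmax.
    logits = np.einsum("hqk,hg->gqk", logits, w_pre)
    weights = softmax(logits, axis=-1)
    # Mix attention weights across heads after the softmax.
    weights = np.einsum("hqk,hg->gqk", weights, w_post)
    # Weighted sum of values per head.
    return np.einsum("hqk,hkd->hqd", weights, v)

# Toy inputs: 4 heads, 6 tokens, 8 dimensions per head.
rng = np.random.default_rng(0)
H, N, D = 4, 6, 8
q, k, v = (rng.standard_normal((H, N, D)) for _ in range(3))

# With identity mixing matrices the heads do not "talk", and the result
# reduces to ordinary multi-head attention.
out = talking_heads_attention(q, k, v, np.eye(H), np.eye(H))
```

In practice `w_pre` and `w_post` would be learned parameters, so that each head's attention pattern can draw on information from the other heads.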