Main concepts
ParFormer enhances feature extraction in transformers by integrating different token mixers and a Convolution Attention Patch Embedding (CAPE).
Statistics
This work presents ParFormer as an enhanced transformer architecture.
A comprehensive evaluation demonstrates that ParFormer outperforms CNN-based and state-of-the-art transformer-based architectures on image classification.
The proposed CAPE benefits the overall MetaFormer architecture, yielding a 0.5% increase in accuracy.
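To make the patch-embedding idea concrete, the sketch below implements a plain strided convolution used as a patch embedding, which is the general mechanism CAPE builds on. This is an illustrative sketch only, not the paper's exact CAPE module; the function name, shapes, and the choice of a 4x4 stride-4 kernel are assumptions for demonstration.

```python
import numpy as np

def conv_patch_embed(image, weight, bias, stride):
    """Illustrative strided 2D convolution acting as a patch embedding.

    image:  (C_in, H, W) input image
    weight: (C_out, C_in, k, k) convolution kernel
    bias:   (C_out,) bias per output channel
    Returns tokens of shape (num_patches, C_out), one embedding per patch.
    """
    c_out, c_in, k, _ = weight.shape
    _, h, w = image.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    tokens = np.empty((out_h * out_w, c_out))
    idx = 0
    for i in range(out_h):
        for j in range(out_w):
            # Extract the receptive field and project it to C_out dimensions.
            patch = image[:, i * stride:i * stride + k, j * stride:j * stride + k]
            tokens[idx] = np.tensordot(weight, patch,
                                       axes=([1, 2, 3], [0, 1, 2])) + bias
            idx += 1
    return tokens

# Example: a 3x32x32 image with 4x4 patches (stride 4) yields 8x8 = 64 tokens.
image = np.random.rand(3, 32, 32)
weight = np.random.rand(8, 3, 4, 4) * 0.01
bias = np.zeros(8)
tokens = conv_patch_embed(image, weight, bias, stride=4)
```

With overlapping strides (stride < k) the same routine produces overlapping patches, which is closer in spirit to convolutional patch embeddings than the non-overlapping splits used by plain ViT.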