Core Concepts
SWformer integrates wavelet transform for high-frequency feature learning in SNNs.
Abstract
The paper introduces the Spiking Wavelet Transformer (SWformer) as an alternative to self-attention-based token mixers, emphasizing frequency learning. It proposes a Frequency-Aware Token Mixer (FATM) for comprehensive spatial-frequency feature learning. SWformer outperforms vanilla Spiking Transformers in capturing high-frequency visual components, reducing energy consumption and parameter count while improving performance on the ImageNet dataset. The architecture builds on spiking neuron layers and spiking vision transformer designs, and its event-driven operation suits neuromorphic chips. Experiments demonstrate SWformer's effectiveness on both static and neuromorphic datasets.
Introduction
Spiking Neural Networks (SNNs) mimic biological neurons with binary spikes.
SNNs offer efficiency but lag behind ANNs in accuracy; in particular, self-attention-based token mixers rely on global operations that struggle to capture high-frequency components.
Incorporating advanced architectures from ANNs enhances SNN performance.
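The binary-spike computation mentioned above can be sketched with a leaky integrate-and-fire (LIF) neuron, the standard unit in spiking vision transformers. This is a minimal illustration of spike generation, not the paper's neuron model; the time constant, threshold, and reset value are assumed for demonstration.

```python
import numpy as np

def lif_simulate(inputs, tau=2.0, v_th=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron over a sequence of
    timesteps and return its binary (0/1) spike train.

    inputs: input current per timestep.
    tau, v_th, v_reset: illustrative membrane time constant, firing
    threshold, and post-spike reset potential.
    """
    v = 0.0
    spikes = []
    for x in inputs:
        # Leaky integration: the membrane potential decays toward the
        # input-driven equilibrium each step.
        v = v + (x - v) / tau
        if v >= v_th:          # threshold crossing emits a binary spike
            spikes.append(1)
            v = v_reset        # hard reset after firing
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant supra-threshold input produces a regular spike train.
spikes = lif_simulate(np.full(10, 1.5))
```

Because the output is strictly 0/1, downstream layers replace multiply-accumulate operations with sparse accumulates, which is the source of the efficiency claim above.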
Spiking Wavelet Transformer
SWformer integrates the wavelet transform for spatial-frequency feature learning.
FATM processes input through three branches for comprehensive feature extraction.
Negative spike dynamics enhance frequency representation in SNNs.
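The frequency decomposition underlying the wavelet branch can be illustrated with a single-level 2D Haar transform. This is a minimal sketch of the classical Haar DWT, not the paper's FATM implementation; it shows how a feature map splits into one low-frequency and three high-frequency subbands, and why the (signed) detail coefficients motivate negative spike dynamics.

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2D Haar wavelet transform.

    x: (H, W) feature map with even H and W.
    Returns (LL, LH, HL, HH): the low-frequency approximation and the
    horizontal / vertical / diagonal high-frequency detail subbands,
    each of shape (H//2, W//2).
    """
    # The four corners of each non-overlapping 2x2 block.
    a = x[0::2, 0::2]
    b = x[0::2, 1::2]
    c = x[1::2, 0::2]
    d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # block average -> low frequency
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh
```

On a constant input the three detail subbands are exactly zero, while edges and textures produce signed detail coefficients. Binary spikes alone cannot encode those negative values, which is the intuition behind adding negative spike dynamics to the frequency representation.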
Experiment
SWformer outperforms vanilla Spiking Transformers on the ImageNet dataset.
Performance improvements seen on both static and neuromorphic datasets.
Method Analysis
Visualization shows FATM captures specific frequency information effectively.
Number of splitting blocks impacts resource utilization and performance.
Firing threshold affects power consumption without compromising accuracy.
Stats
Spiking neural networks offer energy-efficient processing by mimicking the brain's event-driven computation. SWformer reduces energy consumption by over 50% compared to vanilla Spiking Transformers.
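Energy figures like the one above are commonly estimated in the SNN literature by counting accumulate (AC) operations for spike-driven layers against multiply-accumulate (MAC) operations for ANN layers. The sketch below uses this standard accounting; the per-operation costs (4.6 pJ/MAC, 0.9 pJ/AC, 45 nm CMOS), timestep count, and firing rate are assumed illustrative figures, not numbers taken from this document.

```python
# Assumed 45 nm CMOS energy costs, widely used in SNN energy estimates.
E_MAC = 4.6e-12  # joules per multiply-accumulate (ANN)
E_AC = 0.9e-12   # joules per accumulate (spike-driven SNN)

def ann_energy(macs):
    """Energy of an ANN layer performing `macs` multiply-accumulates."""
    return macs * E_MAC

def snn_energy(macs, timesteps, firing_rate):
    """Energy of the equivalent spike-driven layer.

    Spike-driven operations scale with the number of timesteps and the
    fraction of neurons that actually fire (spike sparsity).
    """
    return macs * timesteps * firing_rate * E_AC

# Illustrative layer: 1 GMAC, 4 timesteps, 20% firing rate (assumed).
e_ann = ann_energy(1e9)
e_snn = snn_energy(1e9, timesteps=4, firing_rate=0.2)
ratio = e_snn / e_ann
```

Under these assumed rates the spike-driven layer costs roughly 16% of the ANN layer's energy, showing how sparsity and cheap accumulates can yield reductions of the magnitude reported above.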
Quotes
"SWformer captures more high-frequency signals than Spiking Transformers."
"Experiments show SWformer's effectiveness in capturing spatial-frequency patterns."