Main Concepts
The author introduces a Recurrent Spiking Transformer framework to efficiently extract spatio-temporal features for visual saliency detection while minimizing power consumption.
Summary
The paper explores visual saliency detection in continuous spike streams captured by a spike camera. The proposed Recurrent Spiking Transformer (RST) framework extracts spatio-temporal features from binary spike streams and outperforms other spiking neural network (SNN)-based methods. The study also contributes a real-world spike-based visual saliency dataset covering diverse lighting conditions. Experiments demonstrate that the RST framework effectively captures and highlights visual saliency in spike streams, opening new perspectives on SNN-based transformers.
Key points:
- Introduction of spike camera emulating fovea principles.
- Proposal of Recurrent Spiking Transformer (RST) framework.
- Creation of real-world spike-based visual saliency dataset.
- Superior performance demonstrated through experiments.
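The key points above rest on spiking neural networks. A minimal leaky integrate-and-fire (LIF) neuron, the standard SNN building block, can be sketched as follows; this is an illustrative sketch, not the paper's exact neuron model, and the `tau` and `v_threshold` values are arbitrary assumptions:

```python
import numpy as np

def lif_forward(inputs, tau=2.0, v_threshold=0.5):
    """Run a leaky integrate-and-fire neuron over T timesteps.

    inputs: array of shape (T,) holding input currents.
    Returns a binary spike train of shape (T,).
    """
    v = 0.0
    spikes = np.zeros_like(inputs)
    for t, x in enumerate(inputs):
        # Leaky integration of the membrane potential toward the input.
        v = v + (x - v) / tau
        if v >= v_threshold:
            spikes[t] = 1.0
            v = 0.0  # hard reset after firing
    return spikes

spike_train = lif_forward(np.array([0.5, 0.9, 1.2, 0.1, 1.5]))
```

The neuron emits a 1 only when its accumulated potential crosses the threshold, which is what makes SNN activations binary and energy-efficient, as the power figures below reflect.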
Statistics
"Our model operates at 5.8 mJ, reducing power usage by a factor of 28.7."
"Dataset comprises 130 sequences divided into 200 subsequences each."
"Extensive experimental results show superiority over other SNN models."
Quotes
"The raw data captured by the spike camera takes the form of a three-dimensional spike array denoted as D."
"Our model excels in capturing finer details compared to other SNN-based methods."
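The three-dimensional spike array D mentioned in the quote can be illustrated with a toy example; the shape convention (T, H, W) and the firing-rate readout below are illustrative assumptions, not the paper's specification:

```python
import numpy as np

# Toy binary spike stream: T timesteps of H x W spike frames.
# The shape convention (T, H, W) for the spike array D is an assumption.
T, H, W = 8, 4, 4
rng = np.random.default_rng(0)
D = (rng.random((T, H, W)) < 0.3).astype(np.uint8)  # binary spikes only

# Simple per-pixel firing-rate map: brighter regions fire more often,
# so spike counts over a time window approximate light intensity.
rate_map = D.sum(axis=0) / T
```

Downstream models such as the RST consume windows of this binary array rather than conventional image frames, which is why spatio-temporal feature extraction is central to the method.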