Basic Concepts
SimA is a Softmax-free attention block for vision transformers: it replaces the softmax in attention with a simple ℓ1 normalization of the query and key matrices, simplifying computation while achieving results on par with SOTA models.
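A minimal sketch of the idea follows, assuming the ℓ1 normalization is applied to each channel of Q and K across the token axis; the function name, `eps`, and the shapes are illustrative, not the paper's reference implementation. Because no softmax sits between the matrix products, the computation is associative and can be done in whichever multiplication order is cheaper.

```python
import numpy as np

def sima_attention(q, k, v, eps=1e-6):
    """Softmax-free attention sketch; q, k, v have shape (n_tokens, d_channels)."""
    # l1-normalize each channel of Q and K across tokens; this replaces
    # the softmax over QK^T (normalization axis is my reading of the paper).
    q = q / (np.abs(q).sum(axis=0, keepdims=True) + eps)
    k = k / (np.abs(k).sum(axis=0, keepdims=True) + eps)
    n, d = q.shape
    # (Q K^T) V is associative without softmax, so pick the cheaper order:
    # O(n * d^2) when computing K^T V first, O(n^2 * d) otherwise.
    if n > d:
        return q @ (k.T @ v)   # linear in the number of tokens
    return (q @ k.T) @ v       # classic quadratic order

# Illustrative shapes: 197 tokens (ViT-style), 64 channels per head.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((197, 64)) for _ in range(3))
print(sima_attention(q, k, v).shape)  # (197, 64)
```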
Statistics
"Softmax consumes more time compared to any other components including query (Q), key (K), value (V ) operation (Softmax: 453 µs , QKV projections: 333 µs, QKT : 189 µs)."
"Our method is numerically more stable so we use half-precision floating point without overflowing."
"SimA achieves on-par results with SOTA models on various benchmarks."
Quotes
"Changing Multi-head attention to Single-head one or changing GELU activation function to ReLU has a very small effect on the accuracy of SimA."
"Removing the cost of exp(.) operation can have a large impact particularly in edge devices with limited resources."