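Inspired by biological evolution, the authors explain the rationale behind the Vision Transformer by analogy with evolutionary algorithms (EAs) and, drawing on effective EA variants, propose a novel pyramid EATFormer architecture.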


coremsg

Evolutionary Algorithm-Inspired Improvements to the Vision Transformer

title_rewrite


The authors propose EATFormer, a novel pyramid Vision Transformer inspired by evolutionary algorithms (EAs), and report state-of-the-art performance on various computer vision tasks. Its key components are an EA-based Transformer (EAT) block, a Global and Local Interaction (GLI) module, a Multi-Scale Region Aggregation (MSRA) module, a Modulated Deformable MSA (MD-MSA), and a Task-Related Head (TRH).


improving-vision-transformer-by-evolutionary-algorithm-inspired-techniques


Improving Vision Transformer by Evolutionary Algorithm-Inspired Techniques