The authors introduce two novel CNN model families, RetinaNets and EVNets, that incorporate biologically inspired front-end blocks to improve robustness to common image corruptions.
The RetinaBlock simulates key features of retinal and lateral geniculate nucleus (LGN) processing, including spatial summation, center-surround antagonism, light adaptation, and contrast normalization. RetinaNets integrate the RetinaBlock with a standard CNN back-end, while EVNets couple the RetinaBlock with the previously proposed VOneBlock (simulating primary visual cortex) before the back-end.
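The mechanisms attributed to the RetinaBlock above can be sketched with standard signal-processing operations: center-surround antagonism as difference-of-Gaussians (DoG) filtering, light adaptation as division by local mean luminance, and contrast normalization as division by local contrast energy. The sketch below is an illustrative NumPy rendering of these general mechanisms under assumed parameter values (`sigma_c`, `sigma_s`, `eps`), not the authors' actual RetinaBlock implementation.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    # normalized 2D Gaussian kernel (sums to 1)
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def dog_kernel(size, sigma_c, sigma_s):
    # center-surround antagonism: narrow excitatory center
    # minus broad inhibitory surround (difference of Gaussians)
    return gaussian_kernel(size, sigma_c) - gaussian_kernel(size, sigma_s)

def conv2d(img, kernel):
    # 'same' 2D filtering with zero padding
    # (kernels here are symmetric, so correlation equals convolution)
    ks = kernel.shape[0]
    pad = ks // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + ks, j:j + ks] * kernel)
    return out

def retina_block(img, size=7, sigma_c=0.5, sigma_s=1.5, eps=1e-3):
    # light adaptation: divide by local mean luminance (assumed form)
    local_mean = conv2d(img, gaussian_kernel(size, sigma_s))
    adapted = img / (local_mean + eps)
    # spatial summation + center-surround filtering via a DoG kernel
    cs = conv2d(adapted, dog_kernel(size, sigma_c, sigma_s))
    # contrast normalization: divide by pooled local contrast energy
    local_energy = np.sqrt(conv2d(cs**2, gaussian_kernel(size, sigma_s)) + eps)
    return cs / local_energy
```

In a RetinaNet- or EVNet-style model, the output of such a block would be fed to the CNN back-end (or to a VOneBlock first, in the EVNet case); here the parameters are fixed for illustration rather than fit to retinal physiology.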
Experiments on the Tiny ImageNet dataset show that both RetinaNets and EVNets exhibit improved robustness to a wide range of common corruptions compared to the base CNN models, with EVNets providing the largest gains. The improvements are observed across different CNN architectures (ResNet18 and VGG16).
The authors find that the RetinaBlock and VOneBlock contribute complementary forms of invariance, yielding cumulative robustness gains when combined in the EVNet architecture. Although the biologically inspired front-ends slightly reduce clean-image accuracy, the overall robustness improvements demonstrate the value of incorporating early visual processing mechanisms into deep learning models.
Source: Lucas Piper et al., arXiv, 2024-09-26. https://arxiv.org/pdf/2409.16838.pdf