Efficient Single-Stream Audio Recognition Architecture with Lightweight Design and Fast Inference
The proposed AudioRepInceptionNeXt is a lightweight single-stream CNN that cuts computational cost and memory footprint by more than 50% relative to state-of-the-art models, while maintaining comparable accuracy and running inference significantly faster.