Efficient Dataset Distillation using Attention Mixer (ATOM) for Improved Performance and Generalization
The ATOM framework distills a large dataset into a compact synthetic set by matching a mixture of spatial and channel-wise attention between real and synthetic data, yielding stronger performance and better cross-architecture generalization than prior dataset distillation methods.
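To make the idea concrete, here is a minimal NumPy sketch of how spatial and channel-wise attention signals could be extracted from an intermediate feature map and mixed into a single matching target. The exponent `p`, the L2 normalization, the mixing weight `lam`, and the concatenation-based mixing are illustrative assumptions, not necessarily ATOM's exact formulation.

```python
import numpy as np

def spatial_attention(feat, p=2):
    # feat: (B, C, H, W). Aggregate |a|^p over channels to get one
    # spatial map per sample, flattened to (B, H*W) and L2-normalized.
    a = np.abs(feat) ** p
    s = a.sum(axis=1).reshape(feat.shape[0], -1)
    return s / (np.linalg.norm(s, axis=1, keepdims=True) + 1e-8)

def channel_attention(feat, p=2):
    # Aggregate |a|^p over the spatial dimensions to get one weight
    # per channel, shape (B, C), L2-normalized.
    a = np.abs(feat) ** p
    c = a.sum(axis=(2, 3))
    return c / (np.linalg.norm(c, axis=1, keepdims=True) + 1e-8)

def mixed_attention(feat, lam=0.5, p=2):
    # Hypothetical mixer: weighted concatenation of the two attention
    # vectors; a distillation loss could then match this vector
    # between real and synthetic batches (e.g. via MSE).
    return np.concatenate(
        [lam * spatial_attention(feat, p),
         (1.0 - lam) * channel_attention(feat, p)],
        axis=1,
    )
```

In a distillation loop, one would compute `mixed_attention` on feature maps from both real and synthetic batches at several network layers and minimize the distance between them with respect to the synthetic pixels.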