Core Concepts
Neural Architecture Search (NAS) methods should be able to find optimal neural network architectures for diverse datasets, not just common benchmarks. This work introduces eight new datasets to challenge NAS approaches and evaluate their generalization capabilities.
Abstract
The authors argue that current NAS methods are usually evaluated on a small set of benchmark datasets, so they risk overfitting to those benchmarks and generalizing poorly to real-world problems. To address this, they introduce eight new datasets, spanning different task types and levels of difficulty, to serve as a more comprehensive benchmark for evaluating NAS approaches.
The datasets include:
- AddNIST: Requires models to learn to add three MNIST digits, one placed in each color channel.
- Language: Requires models to identify the language of encoded text images.
- MultNIST: Requires models to learn to multiply three MNIST digits, one placed in each color channel.
- CIFARTile: Requires models to identify the number of unique CIFAR-10 classes in a tiled image.
- Gutenberg: Requires models to identify the author of encoded text snippets.
- Isabella: Requires models to classify musical recordings by era of composition.
- GeoClassing: Requires models to identify the country depicted in satellite imagery.
- Chesseract: Requires models to determine the outcome of a chess game from the final board position.
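To make the arithmetic and counting tasks concrete, the following is a minimal sketch of how labels for AddNIST, MultNIST, and CIFARTile could be derived from their underlying components. The function names are hypothetical, and the exact labelling rules are assumptions: this summary does not say how the AddNIST sum is mapped onto its class indices, and the modulo step for MultNIST is chosen here only because it yields ten possible labels.

```python
from math import prod


def addnist_sum(digits):
    """Return the raw sum of the per-channel digits.

    How this sum maps onto class indices is not specified in the summary,
    so no mapping is applied here.
    """
    return sum(digits)


def multnist_label(digits, num_classes=10):
    """Return the product of the per-channel digits folded into num_classes.

    The modulo step is an assumption made for illustration.
    """
    return prod(digits) % num_classes


def cifartile_unique(tile_classes):
    """Return how many distinct classes appear among the tiles of one image."""
    return len(set(tile_classes))


if __name__ == "__main__":
    rgb_digits = (3, 5, 7)  # one MNIST digit per color channel
    print(addnist_sum(rgb_digits))
    print(multnist_label(rgb_digits))
    print(cifartile_unique(["cat", "dog", "cat", "ship"]))
```

In all three cases the label depends on a relationship between parts of the input rather than on any single object in it, which is what sets these tasks apart from ordinary image classification.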
The authors provide baseline results using various CNN architectures and NAS methods, demonstrating the diverse challenges posed by these datasets. The results highlight the need for NAS approaches that can generalize well beyond common benchmarks to be truly effective in real-world applications.
Stats
Key statistics for each dataset (number of images, number of classes, input shape):
- AddNIST: 70,000 images, 20 classes, 3x28x28 shape
- Language: 70,000 images, 10 classes, 1x24x24 shape
- MultNIST: 70,000 images, 10 classes, 3x28x28 shape
- CIFARTile: 70,000 images, 4 classes, 3x64x64 shape
- Gutenberg: 66,000 images, 6 classes, 1x27x18 shape
- Isabella: 70,000 images, 4 classes, 1x64x128 shape
- GeoClassing: 61,330 images, 10 classes, 3x60x60 shape
- Chesseract: 69,996 images, 3 classes, 12x8x8 shape
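The input shapes and class counts above differ considerably from one dataset to the next, so any architecture search has to cope with varying channel counts and spatial sizes. As a rough sketch, the figures can be collected into a small table in code and used to compute the number of input values per sample. The names `DatasetSpec`, `SPECS`, and `input_size` are hypothetical and are not part of any released tooling; the numbers are copied from the list above.

```python
from typing import NamedTuple, Tuple


class DatasetSpec(NamedTuple):
    images: int
    classes: int
    shape: Tuple[int, int, int]  # (channels, height, width) as listed above


# Figures copied from the statistics list in this section.
SPECS = {
    "AddNIST": DatasetSpec(70_000, 20, (3, 28, 28)),
    "Language": DatasetSpec(70_000, 10, (1, 24, 24)),
    "MultNIST": DatasetSpec(70_000, 10, (3, 28, 28)),
    "CIFARTile": DatasetSpec(70_000, 4, (3, 64, 64)),
    "Gutenberg": DatasetSpec(66_000, 6, (1, 27, 18)),
    "Isabella": DatasetSpec(70_000, 4, (1, 64, 128)),
    "GeoClassing": DatasetSpec(61_330, 10, (3, 60, 60)),
    "Chesseract": DatasetSpec(69_996, 3, (12, 8, 8)),
}


def input_size(spec: DatasetSpec) -> int:
    """Number of input values in one sample: channels * height * width."""
    channels, height, width = spec.shape
    return channels * height * width


if __name__ == "__main__":
    for name, spec in SPECS.items():
        print(f"{name}: {input_size(spec)} inputs, {spec.classes} classes")
```

A search space hard-coded for three-channel 32x32 inputs would not fit most of these entries without modification, which is one practical way the datasets test generalization.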