Learning the training data distribution can significantly improve the accuracy of small interpretable models, making them competitive with specialized techniques.