A Benchmark For Benchmarks
Benchmark datasets like MNIST, ImageNet, etc., abound in machine learning. Such datasets stimulate work on a problem by providing an agreed-upon mark to beat. Many of the benchmark datasets, however, are constructed in an ad hoc manner. As a result, it is hard to understand why the best-performing models vary