MNIST database

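The MNIST database (Modified National Institute of Standards and Technology database^[1]) is a large database of handwritten digits that is commonly used for training various image processing systems^[2]^[3]. It was created by modifying samples from NIST's original datasets and contains 60,000 training images and 10,000 test images of handwritten digits, each 28×28 pixels. The database is also widely used for training and testing in the field of machine learning^[4]^[5]. At the time of its release, its creators reported an error rate of 0.8% using a support vector machine; later studies have substantially improved on that result using deep learning techniques.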

History

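The MNIST database was created by "re-mixing" samples from NIST's original datasets^[6]. Because NIST's training data was collected from American Census Bureau employees while its testing data was collected from American high school students, the creators felt that the original split was not well suited to machine learning experiments^[7]. In addition, the black-and-white NIST images were normalized to fit into a 28×28 pixel bounding box and anti-aliased, which introduced grayscale levels^[7].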
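To illustrate why anti-aliasing introduces grayscale levels into a bilevel image, the following is a minimal sketch in plain Python. It assumes simple block averaging as a stand-in for the anti-aliasing step; this is not the procedure the MNIST creators actually used, and the 56×56 disc image and the function name `downsample_antialiased` are hypothetical.

```python
def downsample_antialiased(binary, factor):
    """Shrink a 0/1 image by averaging each factor x factor block."""
    size = len(binary) // factor
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            block_sum = sum(
                binary[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            )
            # Scale the fraction of "ink" in the block to an 8-bit gray level.
            row.append(round(255 * block_sum / (factor * factor)))
        out.append(row)
    return out


# Hypothetical 56x56 bilevel image: a filled disc (1 = ink, 0 = background).
SIDE = 56
disc = [
    [1 if (x - 27.5) ** 2 + (y - 27.5) ** 2 <= 20 ** 2 else 0
     for x in range(SIDE)]
    for y in range(SIDE)
]

small = downsample_antialiased(disc, 2)          # 28x28 result
levels = sorted({v for row in small for v in row})
print(len(small), len(small[0]))
print(levels)  # values between 0 and 255 come from partially covered blocks
```

Blocks that straddle the edge of the shape are only partly covered by ink, so their average falls between pure black and pure white. The input has only two levels, but the output has more.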

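The MNIST database contains 60,000 training images and 10,000 test images^[8]. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of each was taken from NIST's testing dataset^[9]. The original creators of the database keep a list of some of the methods tested on it^[7]. In their original paper, they obtained an error rate of 0.8% using a support vector machine^[10]. However, the original MNIST database contains at least 4 wrong labels^[11].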

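Extended MNIST (EMNIST) is a newer dataset developed and released by NIST as the (eventual) successor to MNIST^[12]^[13]. MNIST contains images of handwritten digits only, whereas EMNIST includes all the images from NIST Special Database 19, a large database of handwritten uppercase and lowercase letters as well as digits^[14]^[15].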

Performance

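Some studies have achieved "near-human performance" on the MNIST database using artificial neural networks^[16]. The highest error rate listed on the original website of the database is 12%, obtained with a simple linear classifier and no preprocessing^[10]^[7].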

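In 2004, researchers achieved a best-case error rate of 0.42% on the database using a classifier called LIRA, a neural classifier with three neuron layers based on Rosenblatt's perceptron principles^[17].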

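Some researchers have tested artificial intelligence systems on versions of the MNIST database subjected to random distortions. These systems are usually artificial neural networks, and the distortions used tend to be either affine distortions or elastic distortions^[7]. In some cases these systems can be very successful; one such system achieved an error rate of 0.39% on the database^[18].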
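As a rough illustration of what a random affine distortion of a digit image involves, here is a minimal sketch in plain Python. It is not the method of any cited study: the function name `affine_distort`, the nearest-neighbour sampling, the parameter ranges, and the synthetic 28×28 stroke image are all assumptions made for illustration.

```python
import math
import random


def affine_distort(image, angle_deg=0.0, shear=0.0, scale=1.0):
    """Rotate, shear, and scale a 2-D image about its centre.

    Uses inverse mapping with nearest-neighbour sampling; output pixels
    whose source falls outside the image are set to 0 (background).
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)

    # Forward matrix M = scale * Rotation(angle) @ Shear(shear).
    m00 = scale * cos_a
    m01 = scale * (cos_a * shear - sin_a)
    m10 = scale * sin_a
    m11 = scale * (sin_a * shear + cos_a)

    # Invert M so each output pixel can look up its source pixel.
    det = m00 * m11 - m01 * m10
    i00, i01 = m11 / det, -m01 / det
    i10, i11 = -m10 / det, m00 / det

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = round(i00 * dx + i01 * dy + cx)
            sy = round(i10 * dx + i11 * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


# Hypothetical 28x28 "digit": a vertical stroke, roughly like a 1.
stroke = [[255 if 12 <= x <= 15 and 4 <= y <= 23 else 0 for x in range(28)]
          for y in range(28)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
distorted = affine_distort(
    stroke,
    angle_deg=rng.uniform(-15, 15),  # illustrative range, not from the source
    shear=rng.uniform(-0.2, 0.2),    # illustrative range
    scale=rng.uniform(0.9, 1.1),     # illustrative range
)
print(len(distorted), len(distorted[0]))
```

With the default arguments the transform is the identity, which gives a quick sanity check. Elastic distortions, which apply a smoothed random displacement field rather than a single matrix, are not covered by this sketch.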

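In 2011, researchers using a similar system of neural networks reported an error rate of 0.27%, improving on the previous best result^[19]. In 2013, an approach based on regularization of neural networks using DropConnect was claimed to achieve an error rate of 0.21%^[20]. In 2016, the best performance of a single convolutional neural network on MNIST was an error rate of 0.25%^[21]. As of August 2018, the best performance of a single convolutional neural network trained on MNIST training data with no data augmentation was still an error rate of 0.25%^[21]^[22]. In addition, the Parallel Computing Center (Khmelnytskyi, Ukraine) obtained an ensemble of only 5 convolutional neural networks that performs on MNIST at an error rate of 0.21%^[23]^[24].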
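For a sense of scale, the percentages quoted in this section can be converted into counts of misclassified test images. The short sketch below assumes each rate was measured on the full 10,000-image test set; the helper name `misclassified` is hypothetical.

```python
TEST_IMAGES = 10_000  # size of the MNIST test set, as stated in this article


def misclassified(error_rate_percent, n_test=TEST_IMAGES):
    """Number of test images a classifier with this error rate gets wrong."""
    return round(n_test * error_rate_percent / 100)


# Error rates (in percent) quoted in this article.
reported = [
    ("linear classifier, no preprocessing", 12.0),
    ("support vector machine", 0.8),
    ("LIRA (2004)", 0.42),
    ("system tested with distortions", 0.39),
    ("neural network system (2011)", 0.27),
    ("single CNN (2016)", 0.25),
    ("DropConnect (2013)", 0.21),
]

for name, rate in reported:
    print(f"{name}: {rate}% -> {misclassified(rate)} of {TEST_IMAGES} images")
```

Under that assumption, the step from 0.8% to 0.21% corresponds to going from 80 misclassified test images to 21.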

See also

List of datasets for machine learning research
Caltech 101
LabelMe
Optical character recognition

References

[1] THE MNIST DATABASE of handwritten digits. Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond.
[2] Support vector machines speed pattern recognition. Vision Systems Design. Retrieved 2013-08-17.
[3] Gangaputra, Sachin. Handwritten digit database. Retrieved 2013-08-17.
[4] Qiao, Yu. THE MNIST DATABASE of handwritten digits. 2007. Retrieved 2013-08-18. Archived from the original on 2018-02-11.
[5] Platt, John C. Using analytic QP and sparseness to speed training of support vector machines (PDF). Advances in Neural Information Processing Systems. 1999: 557–563. Retrieved 2013-08-18. Archived from the original (PDF) on 2016-03-04.
[6] Grother, Patrick J. NIST Special Database 19 - Handprinted Forms and Characters Database (PDF). National Institute of Standards and Technology.
[7] LeCun, Yann; Cortes, Corinna; Burges, Christopher J.C. The MNIST Handwritten Digit Database. Yann LeCun's Website, yann.lecun.com. Retrieved 2020-04-30.
[8] Kussul, Ernst; Baidyk, Tatiana. Improved method of handwritten digit recognition tested on MNIST database. Image and Vision Computing. 2004, 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008.
[9] Zhang, Bin; Srihari, Sargur N. Fast k-Nearest Neighbor Classification Using Cluster-Based Trees (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 2004, 26 (4): 525–528. Retrieved 2020-04-20. PMID 15382657. doi:10.1109/TPAMI.2004.1265868. Archived from the original (PDF) on 2021-07-25.
[10] LeCun, Yann; Léon Bottou; Yoshua Bengio; Patrick Haffner. Gradient-Based Learning Applied to Document Recognition (PDF). Proceedings of the IEEE. 1998, 86 (11): 2278–2324. Retrieved 2013-08-18. doi:10.1109/5.726791.
[11] Muller, Nicolas M.; Markert, Karla. Identifying Mislabeled Instances in Classification Datasets. 2019 International Joint Conference on Neural Networks (IJCNN). IEEE: 1–8. July 2019. ISBN 978-1-7281-1985-4. arXiv:1912.05283. doi:10.1109/IJCNN.2019.8851920.
[12] NIST. The EMNIST Dataset. NIST. 2017-04-04. Retrieved 2022-04-11.
[13] NIST. NIST Special Database 19. NIST. 2010-08-27. Retrieved 2022-04-11.
[14] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: an extension of MNIST to handwritten letters. 2017. arXiv:1702.05373 [cs.CV].
[15] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: an extension of MNIST to handwritten letters. 2017. arXiv:1702.05373v1 [cs.CV].
[16] Cireșan, Dan; Ueli Meier; Jürgen Schmidhuber. Multi-column deep neural networks for image classification (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012: 3642–3649. CiteSeerX 10.1.1.300.3283. ISBN 978-1-4673-1228-8. S2CID 2161592. arXiv:1202.2745. doi:10.1109/CVPR.2012.6248110.
[17] Kussul, Ernst; Tatiana Baidyk. Improved method of handwritten digit recognition tested on MNIST database (PDF). Image and Vision Computing. 2004, 22 (12): 971–981. Retrieved 2013-09-20. doi:10.1016/j.imavis.2004.03.008. Archived from the original (PDF) on 2013-09-21.
[18] Ranzato, Marc'Aurelio; Christopher Poultney; Sumit Chopra; Yann LeCun. Efficient Learning of Sparse Representations with an Energy-Based Model (PDF). Advances in Neural Information Processing Systems. 2006, 19: 1137–1144. Retrieved 2013-09-20.
[19] Ciresan, Dan Claudiu; Ueli Meier; Luca Maria Gambardella; Jürgen Schmidhuber. Convolutional neural network committees for handwritten character classification (PDF). 2011 International Conference on Document Analysis and Recognition (ICDAR). 2011: 1135–1139. Retrieved 2013-09-20. CiteSeerX 10.1.1.465.2138. ISBN 978-1-4577-1350-7. S2CID 10122297. doi:10.1109/ICDAR.2011.229. Archived from the original (PDF) on 2016-02-22.
[20] Wan, Li; Matthew Zeiler; Sixin Zhang; Yann LeCun; Rob Fergus. Regularization of Neural Network using DropConnect. International Conference on Machine Learning (ICML). 2013.
[21] SimpleNet. Lets Keep it simple, Using simple architectures to outperform deeper and more complex architectures. 2016. Retrieved 2020-12-03. arXiv:1608.06037.
[22] SimpNet. Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet. Github. 2018. Retrieved 2020-12-03. arXiv:1802.06205.
[23] Romanuke, Vadim. Parallel Computing Center (Khmelnytskyi, Ukraine) represents an ensemble of 5 convolutional neural networks which performs on MNIST at 0.21 percent error rate. Retrieved 2016-11-24.
[24] Romanuke, Vadim. Training data expansion and boosting of convolutional neural networks for reducing the MNIST dataset error rate. Research Bulletin of NTUU "Kyiv Polytechnic Institute". 2016, 6 (6): 29–34. doi:10.20535/1810-0546.2016.6.84115.

Further reading

Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen. Multi-column deep neural networks for image classification (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York, NY: Institute of Electrical and Electronics Engineers. June 2012: 3642–3649 [2013-12-09]. CiteSeerX 10.1.1.300.3283 . ISBN 9781467312264. OCLC 812295155. S2CID 2161592. arXiv:1202.2745 . doi:10.1109/CVPR.2012.6248110.
