EMNIST
Created by Hello Dataset

Overview

The EMNIST dataset is a set of handwritten letters and digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that directly match the MNIST dataset. Further information on the dataset contents and conversion process can be found in the EMNIST paper listed under Citation.

Dataset Summary

There are six different splits provided in this dataset. A short summary of the dataset is provided below:

  • EMNIST ByClass: 814,255 characters. 62 unbalanced classes.
  • EMNIST ByMerge: 814,255 characters. 47 unbalanced classes.
  • EMNIST Balanced: 131,600 characters. 47 balanced classes.
  • EMNIST Letters: 145,600 characters. 26 balanced classes.
  • EMNIST Digits: 280,000 characters. 10 balanced classes.
  • EMNIST MNIST: 70,000 characters. 10 balanced classes.
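As a quick sanity check on the four balanced splits, each total should divide evenly by its class count. The following is a minimal sketch with the figures copied from the list above; the dictionary and function names are illustrative and not part of any EMNIST tooling.

```python
# Totals and class counts copied from the split summary above.
balanced_splits = {
    "Balanced": (131_600, 47),
    "Letters": (145_600, 26),
    "Digits": (280_000, 10),
    "MNIST": (70_000, 10),
}


def samples_per_class(total, num_classes):
    """Return the per-class sample count, or raise if the split is uneven."""
    per_class, remainder = divmod(total, num_classes)
    if remainder:
        raise ValueError(
            f"{total} samples do not divide evenly into {num_classes} classes"
        )
    return per_class


per_class = {
    name: samples_per_class(total, classes)
    for name, (total, classes) in balanced_splits.items()
}

for name, count in per_class.items():
    print(f"EMNIST {name}: {count} samples per class")
```

The ByClass and ByMerge splits are unbalanced, so no single per-class figure applies to them.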

The full complement of the NIST Special Database 19 is available in the ByClass and ByMerge splits. The EMNIST Balanced dataset contains a set of characters with an equal number of samples per class. The EMNIST Letters dataset merges a balanced set of the uppercase and lowercase letters into a single 26-class task. The EMNIST Digits and EMNIST MNIST datasets provide balanced handwritten digit datasets directly compatible with the original MNIST dataset.
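To illustrate what merging uppercase and lowercase letters into one 26-class task means, the sketch below maps any ASCII letter to a case-insensitive class index. The 0-based numbering and the function name are assumptions made for illustration; the labels stored in the EMNIST Letters files may be encoded differently.

```python
import string


def letter_class(ch):
    """Map an ASCII letter to a case-insensitive class index in 0..25.

    Hypothetical 0-based encoding for illustration only; it is not taken
    from the EMNIST files.
    """
    if len(ch) != 1 or ch not in string.ascii_letters:
        raise ValueError(f"not a single ASCII letter: {ch!r}")
    return string.ascii_lowercase.index(ch.lower())


# Upper and lower case collapse onto the same class.
classes = {letter_class(c) for c in string.ascii_letters}
print(len(classes))
```

Under this mapping the 52 upper and lowercase letter forms collapse onto 26 labels.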

Please refer to the EMNIST paper for further details of the dataset structure.

Data Format

The dataset is provided in two file formats. Both versions contain identical information and are provided purely for convenience. The first version is in MATLAB format, which is accessible through both MATLAB and Python (using the scipy.io.loadmat function). The second version is in the same binary format as the original MNIST dataset, as outlined at http://yann.lecun.com/exdb/mnist/.
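The MNIST binary format (IDX) starts with a 4-byte magic number: two zero bytes, a data-type code, and the number of dimensions. It is followed by one big-endian 32-bit size per dimension and then the raw data in row-major order. The sketch below parses such a buffer using only the standard library. It assumes unsigned-byte data (type code 0x08, as used by the MNIST image and label files), and the function names are illustrative. It runs on a small synthetic buffer, not on an actual EMNIST file.

```python
import math
import struct


def parse_idx_ubyte(buf):
    """Parse an unsigned-byte IDX buffer into (dims, payload)."""
    if len(buf) < 4:
        raise ValueError("buffer too short for an IDX header")
    # Magic number: two zero bytes, a type code, and the dimension count.
    zero, type_code, ndim = struct.unpack(">HBB", buf[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    header_len = 4 + 4 * ndim
    if len(buf) < header_len:
        raise ValueError("buffer too short for the dimension sizes")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    dims = struct.unpack(f">{ndim}I", buf[4:header_len])
    count = math.prod(dims)
    payload = buf[header_len:header_len + count]
    if len(payload) != count:
        raise ValueError("payload length does not match the header")
    return dims, payload


def image_rows(dims, payload, index):
    """Return image `index` of a 3-D (count, rows, cols) payload as rows."""
    count, rows, cols = dims
    if not 0 <= index < count:
        raise IndexError("image index out of range")
    start = index * rows * cols
    return [
        list(payload[start + r * cols:start + (r + 1) * cols])
        for r in range(rows)
    ]


# Synthetic example: two 2x3 "images" holding the byte values 0..11.
demo = struct.pack(">HBB3I", 0, 0x08, 3, 2, 2, 3) + bytes(range(12))
dims, payload = parse_idx_ubyte(demo)
second = image_rows(dims, payload, 1)
print(dims, second)
```

For a real file, the bytes would first be read from disk (for example with open(path, "rb").read(), or gzip.open if the file is compressed). An image file would then be expected to report dimensions of the form (count, 28, 28).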

Citation

Please use the following citation when referencing the dataset:

@inproceedings{cohen2017emnist,
  title={EMNIST: Extending MNIST to handwritten letters},
  author={Cohen, Gregory and Afshar, Saeed and Tapson, Jonathan and Van Schaik, Andre},
  booktitle={2017 International Joint Conference on Neural Networks (IJCNN)},
  pages={2921--2926},
  year={2017},
  organization={IEEE}
}
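
Dataset Information

Application scenario: MNIST
Annotation type: Classification
Task type: Not specified
License: Unknown
Last updated: 2021-03-24 22:54:20

Data Summary

Data format: Image
Data count: 0
Annotated count: 0
File size: 1MB
Copyright holder: Western Sydney University
Annotated by: Unknown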