v2.5.0
not-MNIST Dataset
Load the not-MNIST dataset, which has ten classes covering the letters A-J drawn from various fonts, in Python with one line of code, then plug it into TensorFlow or PyTorch.
Visualization of the not-MNIST Dataset on the Activeloop Platform.

What is not-MNIST Dataset?

The not-MNIST dataset comprises glyphs extracted from freely accessible fonts and symbols, assembled into a dataset similar to MNIST. It is divided into two parts: a relatively small hand-cleaned portion of approximately 19k samples and a larger uncleaned portion of 500k samples. There are ten classes, one for each of the letters A-J, drawn from various fonts.

Download not-MNIST Dataset in Python

Rather than downloading the not-MNIST files manually, you can load the dataset in Python with just one line of code via our open-source package Hub.

Load not-MNIST-small Dataset in Python

import hub
ds = hub.load('hub://activeloop/not-mnist-small')

Load not-MNIST-large Dataset in Python

import hub
ds = hub.load('hub://activeloop/not-mnist-large')
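Because the two subsets differ only in their path, the choice between them can be wrapped in a small helper. The sketch below is illustrative only: `NOTMNIST_PATHS`, `notmnist_path`, and `load_notmnist` are hypothetical names and not part of Hub. Only `hub.load` and the two paths come from the instructions above.

```python
# Dataset paths exactly as given in the loading instructions above.
NOTMNIST_PATHS = {
    "small": "hub://activeloop/not-mnist-small",
    "large": "hub://activeloop/not-mnist-large",
}


def notmnist_path(subset):
    """Return the Hub path for a subset name ('small' or 'large')."""
    try:
        return NOTMNIST_PATHS[subset]
    except KeyError:
        raise ValueError(
            f"unknown subset {subset!r}; expected 'small' or 'large'"
        ) from None


def load_notmnist(subset="small"):
    """Load the chosen subset with hub.load (needs the hub package and network access)."""
    import hub  # imported lazily so notmnist_path works without Hub installed

    return hub.load(notmnist_path(subset))
```

Starting with `load_notmnist("small")` keeps experiments quick on the hand-cleaned portion before switching to `"large"`.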

not-MNIST Dataset Structure

not-MNIST Data Fields

  • image: tensor containing the 28x28 image.
  • label: tensor containing labels that represent letters from A to J.
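To make label values readable, they can be mapped back to letters. The sketch below assumes that labels are stored as integer class indices 0 to 9 corresponding to A to J in alphabetical order. That ordering is an assumption rather than something stated here, so check it against the dataset before relying on it. `CLASS_NAMES`, `label_to_letter`, and `letter_to_label` are hypothetical helper names.

```python
import string

# Assumption: class index 0 is 'A', 1 is 'B', ..., 9 is 'J'.
CLASS_NAMES = list(string.ascii_uppercase[:10])


def label_to_letter(label):
    """Convert an integer class index to its letter."""
    index = int(label)
    if not 0 <= index < len(CLASS_NAMES):
        raise ValueError(f"label {index} is outside the range 0-9")
    return CLASS_NAMES[index]


def letter_to_label(letter):
    """Convert a letter (case-insensitive) to its integer class index."""
    return CLASS_NAMES.index(letter.upper())
```

A label read from the dataset would first need to be reduced to a plain integer before being passed to `label_to_letter`.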

How to use not-MNIST Dataset with PyTorch and TensorFlow in Python

Train a model on not-MNIST dataset with PyTorch in Python

Let's use Hub's built-in PyTorch one-line dataloader to connect the data to the compute:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
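Before training, it can help to check which classes appear in the first few batches. The sketch below assumes each batch behaves like a dictionary keyed by the tensor names listed under the data fields (`image` and `label`), and that each label converts to a plain integer with `int()`. Both points are assumptions about the dataloader's output, and `label_histogram` is a hypothetical helper, not a Hub function.

```python
from collections import Counter
from itertools import islice


def label_histogram(batches, max_batches=10):
    """Count label occurrences over the first `max_batches` batches.

    Assumes each batch is dict-like with a "label" entry that can be
    iterated to yield values convertible with int().
    """
    counts = Counter()
    for batch in islice(batches, max_batches):
        for label in batch["label"]:
            counts[int(label)] += 1
    return counts
```

With the dataloader above, this could be called as `label_histogram(dataloader)`. Since the loader was created with `shuffle=False`, the first batches may contain only a few classes if the samples are stored in class order.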

Train a model on not-MNIST dataset with TensorFlow in Python

dataloader = ds.tensorflow()

Additional Information about not-MNIST Dataset

not-MNIST Dataset Description

not-MNIST Dataset Curators

Yaroslav Bulatov

not-MNIST Dataset Licensing Information

Hub users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license. If you're a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!

not-MNIST Dataset Citation Information

@article{bulatov2011notmnist,
  title={Notmnist dataset},
  author={Bulatov, Yaroslav},
  journal={Google (Books/OCR), Tech. Rep.[Online]. Available: http://yaroslavvb.blogspot.it/2011/09/notmnist-dataset.html},
  volume={2},
  year={2011}
}

not-MNIST Dataset FAQs

What is the not-MNIST dataset for Python?

The not-MNIST dataset is a collection of images of the letters A-J drawn from various fonts, created to be similar to MNIST. It is divided into two parts: a relatively small hand-cleaned portion of approximately 19k samples and a larger uncleaned portion of 500k samples.

How to download the not-MNIST dataset in Python?

You can load the not-MNIST dataset with one line of code using the open-source package Activeloop Hub in Python. See the instructions above on how to load the not-MNIST-small and not-MNIST-large datasets in Python.

How can I use the not-MNIST dataset in PyTorch or TensorFlow?

You can stream the not-MNIST dataset while training a model in PyTorch or TensorFlow with one line of code using the open-source package Activeloop Hub in Python. See the instructions above on how to train a model on the not-MNIST dataset with PyTorch or with TensorFlow in Python.