dSprites Dataset
Load the dSprites dataset in Python with one line of code. Visualize 737280 images of 64x64 resolution. Stream data while training ML models in PyTorch.
Visualization of the dSprites dataset on the Activeloop Platform

What is dSprites Dataset?

The dSprites (Disentanglement testing Sprites) dataset was created to assess the disentanglement properties of unsupervised learning methods, that is, how well a model recovers the ground-truth latent factors. It contains 737280 images at 64x64 resolution, each accompanied by its latent values and latent classes.
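
The image count follows from the dataset's generative factors. As a quick check, assuming the factor sizes documented in the original dSprites repository (1 color, 3 shapes, 6 scales, 40 orientations, 32 x positions, 32 y positions; these are not listed on this page), the full Cartesian product gives the 737280 images:

```python
from math import prod

# Number of distinct values per latent factor, as documented in the original
# dSprites repository (an assumption here; this page does not list them).
LATENT_SIZES = {
    "color": 1,
    "shape": 3,
    "scale": 6,
    "orientation": 40,
    "pos_x": 32,
    "pos_y": 32,
}

# One image is rendered for every combination of factor values.
num_images = prod(LATENT_SIZES.values())
print(num_images)  # 737280
```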

Downloading dSprites Dataset in Python

Rather than downloading dSprites manually, you can load it in Python with one line of code using our open-source package Hub.

Load dSprites Dataset in Python

import hub
ds = hub.load('hub://activeloop/dsprites')

dSprites Dataset Structure

Data Fields

  • images: tensor containing the black and white images
  • latents_classes: tensor containing the integer index of each latent factor value
  • latents_values: tensor containing the actual value of each latent factor
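
To illustrate how a latents_classes vector can relate to an image's position in the dataset, here is a minimal sketch. It assumes the images are stored in the same order as the original dSprites release, where the last factor (y position) varies fastest; whether the Hub copy preserves that order should be checked before relying on it. The helper `latent_to_index` and the factor sizes follow the original repository's conventions and are not part of Hub.

```python
# Factor sizes in the order (color, shape, scale, orientation, pos_x, pos_y).
# Assumed from the original dSprites release, not stated on this page.
LATENT_SIZES = (1, 3, 6, 40, 32, 32)


def latent_to_index(latent_classes, sizes=LATENT_SIZES):
    """Convert one vector of latent class indices to a flat image index.

    Treats the vector as a mixed-radix number whose last digit varies fastest.
    """
    if len(latent_classes) != len(sizes):
        raise ValueError("expected one class index per latent factor")
    index = 0
    for cls, size in zip(latent_classes, sizes):
        if not 0 <= cls < size:
            raise ValueError(f"class index {cls} out of range for size {size}")
        index = index * size + cls
    return index


print(latent_to_index((0, 0, 0, 0, 0, 0)))     # 0
print(latent_to_index((0, 1, 0, 0, 0, 0)))     # 245760
print(latent_to_index((0, 2, 5, 39, 31, 31)))  # 737279
```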

How to use dSprites Dataset with PyTorch and TensorFlow in Python

Train a model on dSprites dataset with PyTorch in Python

Let's use Hub's built-in PyTorch one-line dataloader to connect the data to the compute:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
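
A training loop then iterates over the dataloader. The sketch below assumes each batch behaves like a dictionary keyed by tensor name (images, latents_classes, latents_values), which is worth verifying against your Hub version. `summarize_batches` is a hypothetical helper, and the stand-in batches of plain lists exist only so the snippet runs without downloading any data.

```python
def summarize_batches(loader, max_batches=2):
    """Walk over the first few batches and record the size of each field.

    `loader` is any iterable of dict-like batches, for example the object
    returned by ds.pytorch(...), assuming it yields batches keyed by tensor name.
    """
    summaries = []
    for step, batch in enumerate(loader):
        if step >= max_batches:
            break
        # len() gives the batch dimension for lists and for PyTorch tensors.
        summaries.append({name: len(values) for name, values in batch.items()})
        # A real training step (forward pass, loss, optimizer update) goes here.
    return summaries


# Stand-in batches so the sketch runs offline; replace with the real dataloader.
fake_loader = [
    {
        "images": [[0] * 64] * 4,
        "latents_classes": [[0] * 6] * 4,
        "latents_values": [[0.0] * 6] * 4,
    }
    for _ in range(3)
]
print(summarize_batches(fake_loader))
```

With the real data, pass the `dataloader` object created above in place of `fake_loader`.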

Train a model on dSprites dataset with TensorFlow in Python

dataloader = ds.tensorflow()

Additional Information about dSprites Dataset

dSprites Dataset Description

dSprites Dataset Contributors

Matthey, L., Higgins, I., Hassabis, D., & Lerchner, A.

dSprites Dataset Licensing Information

Hub users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license.
If you're a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!

dSprites Dataset Citation Information

@misc{dsprites17,
author = {Loic Matthey and Irina Higgins and Demis Hassabis and Alexander Lerchner},
title = {dSprites: Disentanglement testing Sprites dataset},
howpublished = {https://github.com/deepmind/dsprites-dataset/},
year = "2017",
}