CelebA Dataset
Load the CelebA dataset quickly in Python: 200K celebrity images with 40 attribute annotations each. Stream the CelebA dataset while training ML models.
Visualization of the CelebA training subset on the Activeloop Platform

What is CelebA Dataset?

The CelebFaces Attributes Dataset (CelebA) consists of more than 200K celebrity images with 40 attribute annotations each. The images cover large pose variations, heavy background clutter, and a diverse set of people, which makes the dataset well suited to training and testing models for face detection and facial attribute recognition. For example, a model trained on it can identify people who have brown hair, are smiling, or are wearing glasses.

Download CelebA Dataset in Python

Instead of downloading the CelebA dataset manually, you can load it in Python with just one line of code via our open-source package Hub.

Load CelebA Dataset Training Subset in Python

import hub
# Load the CelebA training subset from Activeloop Hub
ds = hub.load("hub://activeloop/celeb-a-train")

Load CelebA Dataset Validation Subset in Python

import hub
# Load the CelebA validation subset from Activeloop Hub
ds = hub.load("hub://activeloop/celeb-a-val")

Load CelebA Dataset Testing Subset in Python

import hub
# Load the CelebA testing subset from Activeloop Hub
ds = hub.load("hub://activeloop/celeb-a-test")
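
The three loading snippets above differ only in the subset suffix of the path. A minimal sketch of a wrapper around them follows. It assumes the three paths shown on this page are the only subsets; the names `SPLITS`, `celeba_path`, and `load_celeba` are hypothetical helpers for illustration and are not part of Hub.

```python
# Hypothetical helpers; only hub.load and the three paths come from this page.
SPLITS = ("train", "val", "test")


def celeba_path(split):
    """Return the Hub path for a CelebA subset name."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    return f"hub://activeloop/celeb-a-{split}"


def load_celeba(split):
    """Load one CelebA subset.

    hub is imported lazily so that celeba_path works without it installed.
    """
    import hub

    return hub.load(celeba_path(split))
```

With Hub installed, `ds = load_celeba("val")` would be equivalent to the validation snippet above. Validating the subset name up front reports a typo as a clear error instead of a failed remote lookup.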

CelebA Dataset Structure

CelebA Data Fields

  • image: tensor containing the 178×218 image.
  • bbox: tensor containing the bounding box for each image.
  • keypoints: tensor identifying 63 keypoints on the face.
  • clock_shadow: tensor indicating whether the person has a 5 o'clock shadow.
  • arched_eyebrows: tensor indicating whether the eyebrows are arched.
  • attractive: tensor indicating whether the person is labeled attractive.
  • bags_under_eyes: tensor indicating whether the person has bags under the eyes.
  • bald: tensor indicating whether the person is bald.
  • bangs: tensor indicating whether the person has bangs.
  • big_lips: tensor indicating whether the person has big lips.
  • big_nose: tensor indicating whether the person has a big nose.
  • black_hair: tensor indicating whether the person has black hair.
  • blond_hair: tensor indicating whether the person has blond hair.
  • blurry: tensor indicating whether the image is blurry.
  • brown_hair: tensor indicating whether the person has brown hair.
  • bushy_eyebrows: tensor indicating whether the person has bushy eyebrows.
  • chubby: tensor indicating whether the person is chubby.
  • double_chin: tensor indicating whether the person has a double chin.
  • eyeglasses: tensor indicating whether the person is wearing eyeglasses.
  • goatee: tensor indicating whether the person has a goatee.
  • gray_hair: tensor indicating whether the person has gray hair.
  • heavy_makeup: tensor indicating whether the person is wearing heavy makeup.
  • high_cheekbones: tensor indicating whether the person has high cheekbones.
  • male: tensor indicating whether the person is male.
  • mouth_slightly_open: tensor indicating whether the mouth is slightly open.
  • mustache: tensor indicating whether the person has a mustache.
  • narrow_eyes: tensor indicating whether the person has narrow eyes.
  • no_beard: tensor indicating whether the person has no beard.
  • oval_face: tensor indicating whether the face is oval.
  • pale_skin: tensor indicating whether the skin is pale.
  • pointy_nose: tensor indicating whether the nose is pointy.
  • receding_hairline: tensor indicating whether the hairline is receding.
  • rosy_cheeks: tensor indicating whether the cheeks are rosy.
  • sideburns: tensor indicating whether the person has sideburns.
  • smiling: tensor indicating whether the person is smiling.
  • straight_hair: tensor indicating whether the hair is straight.
  • wavy_hair: tensor indicating whether the hair is wavy.
  • wearing_earrings: tensor indicating whether the person is wearing earrings.
  • wearing_hat: tensor indicating whether the person is wearing a hat.
  • wearing_lipstick: tensor indicating whether the person is wearing lipstick.
  • wearing_necklace: tensor indicating whether the person is wearing a necklace.
  • wearing_necktie: tensor indicating whether the person is wearing a necktie.
  • young: tensor indicating whether the person is young.
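
To illustrate how the 40 attribute fields listed above might be consumed, here is a minimal sketch that turns one sample's labels into a list of attribute names. The function name `present_attributes` and the example values are hypothetical. The label encoding is an assumption: a positive value is treated as "present", which covers both a 0/1 and a -1/1 encoding.

```python
# Attribute tensor names as listed on this page (40 in total).
ATTRIBUTES = [
    "clock_shadow", "arched_eyebrows", "attractive", "bags_under_eyes",
    "bald", "bangs", "big_lips", "big_nose", "black_hair", "blond_hair",
    "blurry", "brown_hair", "bushy_eyebrows", "chubby", "double_chin",
    "eyeglasses", "goatee", "gray_hair", "heavy_makeup", "high_cheekbones",
    "male", "mouth_slightly_open", "mustache", "narrow_eyes", "no_beard",
    "oval_face", "pale_skin", "pointy_nose", "receding_hairline",
    "rosy_cheeks", "sideburns", "smiling", "straight_hair", "wavy_hair",
    "wearing_earrings", "wearing_hat", "wearing_lipstick",
    "wearing_necklace", "wearing_necktie", "young",
]


def present_attributes(sample):
    """Return the names of attributes flagged as present in one sample.

    `sample` maps attribute names to scalar labels. Assumption: a positive
    value means "present"; missing names are treated as absent.
    """
    return [name for name in ATTRIBUTES if int(sample.get(name, 0)) > 0]


# Made-up labels for illustration only, not real CelebA data.
example = {"smiling": 1, "eyeglasses": 1, "no_beard": 0, "young": -1}
print(present_attributes(example))
```

With a loaded dataset, such a mapping could be built from per-sample values such as `ds.smiling[0].numpy()`, assuming each attribute tensor stores one scalar label per image.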

CelebA Data Splits

The CelebA dataset is available in three subsets: training, validation, and testing.

How to use CelebA Dataset with PyTorch and TensorFlow in Python

Train a model on CelebA dataset with PyTorch in Python

Let's use Hub's built-in PyTorch one-line dataloader to connect the data to the compute:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
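
As a rough sketch of how the dataloader above could be consumed, the helper below pulls the images and one attribute from the first few batches. The name `first_batches` and the `max_batches` parameter are hypothetical. It assumes each batch behaves like a dictionary keyed by tensor name (`image`, `smiling`), which is worth verifying against your installed Hub version.

```python
def first_batches(ds, batch_size=4, max_batches=2):
    """Yield (images, smiling_labels) for the first few batches.

    `ds` is a dataset returned by hub.load. Assumption: every batch can be
    indexed by tensor name, as in batch["image"].
    """
    # Same call as in the snippet above, with a configurable batch size.
    loader = ds.pytorch(num_workers=0, batch_size=batch_size, shuffle=False)
    for index, batch in enumerate(loader):
        if index >= max_batches:
            break
        yield batch["image"], batch["smiling"]
```

A loop such as `for images, labels in first_batches(ds): ...` is a convenient place to check batch shapes before writing a full training loop.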

Train a model on CelebA dataset with TensorFlow in Python

dataloader = ds.tensorflow()
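
A minimal sketch of a possible next step, assuming `ds.tensorflow()` returns a `tf.data.Dataset` so that the standard `batch` and `prefetch` methods apply. The helper name `batched_tf_dataset` is hypothetical, and batching assumes every tensor has the same shape across samples.

```python
def batched_tf_dataset(ds, batch_size=4):
    """Batch and prefetch the dataset produced by ds.tensorflow().

    Assumption: the returned object is a tf.data.Dataset, so the standard
    batch() and prefetch() methods are available on it.
    """
    tf_ds = ds.tensorflow()
    # Group samples into batches and keep one batch ready ahead of time.
    return tf_ds.batch(batch_size).prefetch(1)
```

Batching inside the `tf.data` pipeline keeps data loading separate from the model code, so the same pipeline can be reused for training and evaluation.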

Additional Information about CelebA Dataset

CelebA Dataset Description

  • Repository: N/A
  • Paper: Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou: Deep Learning Face Attributes in the Wild, Proceedings of International Conference on Computer Vision (ICCV), 2015
  • Point of Contact: ziwei.liu at ntu.edu.sg

CelebA Dataset Curators

Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou

CelebA Dataset Licensing Information

Hub users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license.
If you're a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!

CelebA Dataset Citation Information

@inproceedings{liu2015faceattributes,
  title = {Deep Learning Face Attributes in the Wild},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
  month = {December},
  year = {2015}
}

CelebA Dataset FAQs

What is the CelebA dataset for Python?

The CelebFaces Attributes Dataset (CelebA) consists of more than 200K celebrity images with 40 attribute annotations each. The images range from extreme poses to heavily cluttered backgrounds.

What is the CelebA dataset used for?

This dataset is well suited to training and testing models for face detection and, in particular, for recognizing facial attributes, such as finding people who have brown hair, are smiling, or are wearing glasses. The images cover large pose variations, background clutter, and a diverse set of people, supported by a large quantity of images and rich annotations.

How can I use CelebA dataset in PyTorch or TensorFlow?

You can stream the CelebA dataset while training a model in PyTorch or TensorFlow with one line of code using the open-source Python package Activeloop Hub. See the detailed instructions on how to train a model on the CelebA dataset with PyTorch in Python or with TensorFlow in Python.