Office-Home Dataset
Load the Office-Home dataset for domain adaptation in Python with one line of code and plug it into TensorFlow or PyTorch with Activeloop Hub.

What is Office-Home Dataset?

The Office-Home dataset was created to evaluate deep learning algorithms for object recognition under domain adaptation. It consists of images from four domains: Art, Clipart, Product, and Real-World. Each domain contains images of 65 object categories commonly found in office and home settings.

Download Office-Home Dataset in Python

Instead of downloading the Office-Home dataset manually, you can load it in Python with one line of code via our open-source package Hub.

Load Office-Home Dataset in Python

import hub
ds = hub.load('hub://activeloop/office-home-domain-adaptation')
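
Once the dataset is loaded, a quick sanity check is to list its tensors and their lengths. The sketch below assumes `ds` exposes a `tensors` mapping of names to tensor objects that support `len()`, as Hub 2.x datasets do; `tensor_lengths` is a hypothetical helper name, not part of Hub.

```python
def tensor_lengths(ds):
    """Return {tensor_name: number_of_samples} for a loaded dataset.

    Assumes `ds.tensors` is a dict-like mapping of tensor names to
    objects that support len().
    """
    return {name: len(tensor) for name, tensor in ds.tensors.items()}


# Usage with the dataset loaded above (requires network access):
#   import hub
#   ds = hub.load('hub://activeloop/office-home-domain-adaptation')
#   print(tensor_lengths(ds))
```

If the three tensors report different lengths, some samples are missing a label and should be checked before training.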

Office-Home Dataset Structure

Office-Home Data Fields

  • images: tensor containing images
  • domain_objects: labels that represent 65 categories of objects in each domain
  • domain_categories: labels that represent 4 domain categories
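
Because every sample carries both an object label and a domain label, a common first step in domain adaptation is to split sample indices by domain. The following is a minimal sketch in plain Python, not Hub's API. The order of names in `DOMAIN_NAMES` is an assumption made for illustration; the actual index-to-name mapping should be read from the dataset's own metadata.

```python
from collections import defaultdict

# Assumed index-to-name order, for illustration only; check the
# dataset's own label metadata before relying on it.
DOMAIN_NAMES = ["Art", "Clipart", "Product", "Real World"]


def indices_by_domain(domain_labels, domain_names=DOMAIN_NAMES):
    """Group sample indices by domain name.

    `domain_labels` is any sequence of integer domain ids, one per sample.
    """
    groups = defaultdict(list)
    for index, label in enumerate(domain_labels):
        groups[domain_names[int(label)]].append(index)
    return dict(groups)


# Toy labels standing in for the `domain_categories` tensor values.
toy_labels = [0, 2, 2, 1, 3, 0]
print(indices_by_domain(toy_labels))
```

With the real dataset, the integer labels would come from the `domain_categories` tensor instead of `toy_labels`.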

How to use Office-Home Dataset with PyTorch and TensorFlow in Python

Train a model on Office-Home Dataset with PyTorch in Python

Let's use Hub's built-in PyTorch one-line dataloader to connect the data to the compute:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
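
Loader settings usually differ between training and evaluation. The helper below is a sketch that only forwards the `num_workers`, `batch_size`, and `shuffle` arguments shown above to `ds.pytorch`; `build_loader` and its defaults are hypothetical choices, not Hub recommendations.

```python
def build_loader(ds, train=True, batch_size=4, num_workers=0):
    """Create a PyTorch dataloader from a Hub dataset.

    Shuffles only when training, so evaluation order stays deterministic.
    """
    return ds.pytorch(
        num_workers=num_workers,
        batch_size=batch_size,
        shuffle=train,
    )


# Usage (requires hub, torch, and a loaded `ds`). This assumes each
# batch is keyed by tensor name:
#   train_loader = build_loader(ds, train=True)
#   for batch in train_loader:
#       images, labels = batch["images"], batch["domain_objects"]
#       ...  # forward pass, loss, optimizer step
```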

Train a model on Office-Home Dataset with TensorFlow in Python

dataloader = ds.tensorflow()
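
`ds.tensorflow()` is expected to return a `tf.data.Dataset`, so standard `tf.data` transformations such as batching can be chained onto it. The sketch below relies on that assumption; `build_tf_pipeline` is a hypothetical helper, not part of Hub.

```python
def build_tf_pipeline(ds, batch_size=4):
    """Wrap a Hub dataset as a batched TensorFlow input pipeline.

    Assumes `ds.tensorflow()` returns a tf.data.Dataset. Batching
    requires samples of equal shape, so images of different sizes
    would need to be resized before this step.
    """
    return ds.tensorflow().batch(batch_size)
```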

Office-Home Dataset Creation

Data Collection and Normalization Information

A Python crawler was used to collect the images. The initial crawl produced 100,000 images of 120 different objects. The dataset was then cleaned to ensure that each image contains the intended object and that each category has a minimum number of images. The final version of the dataset has 15,500 images of 65 different objects.
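
For a sense of scale, the figures quoted above imply the following retention rate and average category size. This is simple arithmetic on the rounded numbers in this section, not official per-category statistics.

```python
crawled_images = 100_000   # images collected by the crawler
final_images = 15_500      # images in the final dataset
num_categories = 65        # object categories
num_domains = 4            # Art, Clipart, Product, Real-World

kept_fraction = final_images / crawled_images
avg_per_category = final_images / num_categories
avg_per_category_per_domain = avg_per_category / num_domains

print(f"kept after cleaning: {kept_fraction:.1%}")
print(f"average images per category: {avg_per_category:.0f}")
print(f"average per category and domain: {avg_per_category_per_domain:.0f}")
```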

Additional Information about Office-Home Dataset

Office-Home Dataset Description

Office-Home Dataset Curators

Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty and Sethuraman Panchanathan

Office-Home Dataset Licensing Information

More information about the license can be found here. Hub users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license. If you're a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!

Office-Home Dataset Citation Information

@inproceedings{venkateswara2017deep,
  title={Deep hashing network for unsupervised domain adaptation},
  author={Venkateswara, Hemanth and Eusebio, Jose and Chakraborty, Shayok and Panchanathan, Sethuraman},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2017}
}

Office-Home Dataset FAQs

What is the Office-Home dataset for Python?

The Office-Home dataset was developed to assess domain adaptation algorithms for object recognition using deep learning. The dataset is made up of images from four different domains: art, clip art, product, and real-world images. The images were collected with a Python web crawler that searched several search engines and online image directories.

What is the Office-Home dataset used for?

The Office-Home dataset is used as a benchmark dataset for domain adaptation. It contains four domains where each domain consists of 65 categories. The four domains include art (a collection of artistic images in the form of sketches), clipart (a collection of clipart images), product (a domain containing images of objects without a background), and real-world images (a domain containing images of objects captured with a regular camera).
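
With four domains, a domain adaptation benchmark is typically run over every ordered source-to-target pair, which gives 12 transfer tasks. The sketch below enumerates them; the two-letter abbreviations are a common shorthand assumed here for illustration.

```python
from itertools import permutations

# Abbreviations are an assumed shorthand for the four domains.
DOMAINS = {"Ar": "Art", "Cl": "Clipart", "Pr": "Product", "Rw": "Real-World"}

# Every ordered (source, target) pair of distinct domains.
transfer_tasks = [f"{src}->{tgt}" for src, tgt in permutations(DOMAINS, 2)]

print(len(transfer_tasks))
print(transfer_tasks)
```

Each task trains on labeled images from the source domain and is evaluated on the target domain, so `Ar->Cl` and `Cl->Ar` are distinct experiments.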

How to download the Office-Home dataset in Python?

With the open-source package Activeloop Hub, you can quickly load the Office-Home dataset in Python with one line of code. See detailed instructions on how to load the Office-Home dataset in Python.

How can I use Office-Home dataset in PyTorch or TensorFlow?

Using the open-source package Activeloop Hub in Python you can stream the Office-Home dataset while training a model in PyTorch or TensorFlow with one line of code. See detailed instructions on how to train a model on Office-Home dataset with PyTorch in Python or train a model on Office-Home dataset with TensorFlow in Python.