Quickstart

A jump-start guide to using Hub.

Installing Hub

Hub can be installed through pip.

$ pip3 install hub
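To confirm that the install succeeded, one option is to ask Python's standard library for the installed package version. The sketch below is not part of Hub: `installed_version` is a hypothetical helper name, and it assumes the distribution is registered under the name `hub`, matching the pip command above.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it is not part of Hub.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("hub")
if version is None:
    print("hub is not installed; run: pip3 install hub")
else:
    print(f"hub {version} is installed")
```

If the first message is printed, check that pip3 and the Python interpreter you are running belong to the same environment.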

Fetching Your First Dataset

Let's load MNIST, the "hello world" dataset of machine learning.

First, instantiate a Dataset by pointing it to the dataset's location. Datasets hosted on Activeloop Platform are typically identified by the namespace of the organization followed by the dataset name, for example activeloop/mnist-train.

import hub

dataset_path = 'hub://activeloop/mnist-train'
ds = hub.dataset(dataset_path) # Returns a Hub Dataset but does 
                               # not download data locally.
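The naming convention described above (organization namespace, then dataset name) can be illustrated with a small path-splitting sketch. `split_hub_path` is a hypothetical helper written for this guide, not a Hub function, and it assumes the path has exactly the `hub://<organization>/<dataset>` shape used in the example.

```python
HUB_SCHEME = "hub://"


def split_hub_path(path: str):
    """Split 'hub://<organization>/<dataset>' into (organization, dataset).

    Hypothetical helper for illustration; it is not part of Hub.
    """
    if not path.startswith(HUB_SCHEME):
        raise ValueError(f"expected a path starting with {HUB_SCHEME!r}, got {path!r}")
    # Everything after the scheme is '<organization>/<dataset>'.
    org, sep, name = path[len(HUB_SCHEME):].partition("/")
    if not org or not sep or not name:
        raise ValueError(f"expected 'hub://<organization>/<dataset>', got {path!r}")
    return org, name


org, name = split_hub_path("hub://activeloop/mnist-train")
print(org, name)
```

Hub itself takes the full path string, so a helper like this is only useful for logging or validating paths before passing them on.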

Reading Samples From a Hub Dataset

Because Hub operates lazily, data is not read into memory immediately. To fetch it, call the .numpy() method, which reads the data into a NumPy array.

# Indexing
W = ds.images[0].numpy() # Fetch an image and return a numpy array
X = ds.labels[0].numpy(as_list=True) # Fetch a label and store it as a 
                                     # list of numpy arrays

# Slicing
Y = ds.images[0:100].numpy() # Fetch 100 images and return a numpy array.
                             # This call raises an exception if the
                             # images are not all the same size.

Z = ds.labels[0:100].numpy(as_list=True) # Fetch 100 labels and store 
                                         # them as a list of numpy arrays
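The comments above note that fetching a slice as a single array fails when samples differ in size, while as_list=True does not. The sketch below reproduces that distinction with plain NumPy rather than Hub: it assumes only that a single array requires every sample to have the same shape, and it makes no claim about how Hub implements this internally. The arrays are synthetic stand-ins, not MNIST data.

```python
import numpy as np

# Synthetic stand-ins for fetched samples (not real dataset contents).
same_size = [np.zeros((28, 28)), np.ones((28, 28))]
mixed_size = [np.zeros((28, 28)), np.ones((32, 32))]

# Equal shapes can be combined into one array with a leading sample axis.
stacked = np.stack(same_size)
print(stacked.shape)

# Unequal shapes cannot be combined into one rectangular array.
try:
    np.stack(mixed_size)
    stack_failed = False
except ValueError:
    stack_failed = True
print("stacking mixed sizes failed:", stack_failed)

# A plain Python list places no constraint on the individual shapes.
as_list = list(mixed_size)
print([sample.shape for sample in as_list])
```

In practice this suggests using as_list=True whenever a dataset may contain samples of varying size, and the default array form when the shapes are known to match.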

Congratulations, you've got Hub working on your local machine! 🤓
