Step 1: Hello World

Installing Hub and accessing your first Hub Dataset.

Installing Hub

Hub can be installed through pip.

$ pip3 install hub

Fetching Your First Hub Dataset

Begin by loading MNIST, the "hello world" dataset of machine learning.

First, load the Dataset by pointing to its storage location. Datasets hosted on the Activeloop Platform are typically identified by the namespace of the organization followed by the dataset name, for example activeloop/mnist-train. The full path adds the hub:// prefix.

import hub
dataset_path = 'hub://activeloop/mnist-train'
ds = hub.load(dataset_path) # Returns a Hub Dataset but does not download data locally

Reading Samples From a Hub Dataset

Data is not immediately read into memory because Hub operates lazily. You can fetch data by calling the .numpy() method, which reads data into a NumPy array.

# Indexing
W = ds.images[0].numpy()  # Fetch an image and return a NumPy array
X = ds.labels[0].numpy(aslist=True)  # Fetch a label and store it as a list of NumPy arrays
# Slicing
# Fetch 100 images and return a NumPy array.
# This call raises an exception if the images are not all the same size.
Y = ds.images[0:100].numpy()
Z = ds.labels[0:100].numpy(aslist=True)  # Fetch 100 labels and store them as a list of NumPy arrays
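
To make the lazy behavior described above more concrete, here is a minimal toy sketch in plain Python. It is not Hub's implementation: LazyColumn, fake_loader, and fetch() are hypothetical names invented for illustration, with fetch() standing in for .numpy() and nested lists standing in for NumPy arrays. The point is only that indexing and slicing narrow a view, while data is read when the fetch method is called.

```python
from typing import Callable, List, Sequence


class LazyColumn:
    """Toy stand-in for a lazily loaded dataset column (not Hub's implementation)."""

    def __init__(self, loader: Callable[[int], list], indices: Sequence[int]):
        self._loader = loader          # called once per sample, only on fetch
        self._indices = list(indices)  # which samples this view refers to

    def __getitem__(self, key):
        # Indexing or slicing only narrows the view; nothing is loaded here.
        picked = self._indices[key]
        if isinstance(picked, int):
            picked = [picked]
        return LazyColumn(self._loader, picked)

    def fetch(self) -> List[list]:
        # The actual read happens here, playing the role of .numpy().
        return [self._loader(i) for i in self._indices]


load_calls = []  # records every sample index the loader was asked for


def fake_loader(i: int) -> list:
    # Hypothetical loader: logs the access and returns a dummy 2x2 "image".
    load_calls.append(i)
    return [[i, i], [i, i]]


images = LazyColumn(fake_loader, range(1000))
view = images[0:100]          # slicing alone triggers no loader calls
calls_before_fetch = len(load_calls)
batch = view.fetch()          # the 100 samples are loaded at this point
calls_after_fetch = len(load_calls)
```

In this sketch, calls_before_fetch stays at zero because slicing only builds a new view, and the loader runs once per sample when fetch() is called.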

Congratulations, you've got Hub working on your local machine! 🤓