Let's load MNIST, the hello world dataset of machine learning.
First, instantiate a Dataset by pointing it to the dataset's location. Datasets hosted on Activeloop Platform are typically identified by the organization's namespace followed by the dataset name, for example activeloop/mnist-train.
from hub import Dataset
ds = Dataset(dataset_path)  # Returns a Hub Dataset but does not download data locally.
Reading Samples From a Hub Dataset
Data is not immediately read into memory because Hub operates lazily. You can fetch data by calling the .numpy() method, which reads data into a NumPy array.
# Fetch one image and return it as a numpy array
W = ds.images[0].numpy()
# Fetch one label and store it as a list of numpy arrays
X = ds.labels[0].numpy(as_list=True)
# Fetch 100 images and return a numpy array.
# This produces an exception if the images are not all the same size.
Y = ds.images[0:100].numpy()
# Fetch 100 labels and store them as a list of numpy arrays
Z = ds.labels[0:100].numpy(as_list=True)
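To make the lazy behavior and the role of as_list more concrete, here is a minimal sketch that runs without Hub. The LazyTensor class and the load_sample function are hypothetical stand-ins written for this illustration; they are not Hub's implementation and only mimic the indexing and .numpy() pattern used above. The sketch assumes that stacking is done with np.stack, which is why mismatched sample shapes fail unless a list is requested.

```python
import numpy as np


class LazyTensor:
    """Hypothetical stand-in for a lazily read tensor (not Hub's implementation)."""

    def __init__(self, loader, indices, single=False):
        self._loader = loader      # callable: sample index -> np.ndarray
        self._indices = indices    # which samples this view refers to
        self._single = single      # True when created by integer indexing

    def __getitem__(self, key):
        # Indexing only narrows the view; no sample is loaded here.
        picked = self._indices[key]
        if isinstance(key, int):
            return LazyTensor(self._loader, [picked], single=True)
        return LazyTensor(self._loader, picked)

    def numpy(self, as_list=False):
        # Data is read only at this point.
        samples = [self._loader(i) for i in self._indices]
        if as_list:
            return samples             # shapes may differ
        if self._single:
            return samples[0]
        return np.stack(samples)       # raises ValueError if shapes differ


reads = []  # records which samples were actually loaded


def load_sample(i):
    """Hypothetical loader; sample 2 is deliberately a different size."""
    reads.append(i)
    side = 3 if i == 2 else 2
    return np.full((side, side), i)


images = LazyTensor(load_sample, range(4))

view = images[0:2]                 # builds a view, loads nothing
reads_before_fetch = len(reads)    # still 0 at this point

batch = view.numpy()                       # loads samples 0 and 1, stacks them
ragged = images[0:4].numpy(as_list=True)   # list of 4 arrays, one per sample
```

Calling images[0:4].numpy() without as_list=True raises a ValueError in this sketch, because sample 2 has a different shape from the others and cannot be stacked into a single array.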
Congratulations, you've got Hub working on your local machine!