Hub can be installed through pip.
$ pip3 install hub
Let's load MNIST, the hello world dataset of machine learning.
First, instantiate a Dataset by pointing it to the dataset's location. Datasets hosted on Activeloop Platform are typically identified by the namespace of the organization followed by the dataset name:
from hub import Dataset

dataset_path = 'hub://activeloop/mnist-train'
ds = Dataset(dataset_path)  # Returns a Hub Dataset but does not download data locally.
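To make the naming convention concrete, here is a minimal sketch that splits a path of the form `hub://<organization>/<dataset>` into its two parts. The helper `split_hub_path` is hypothetical and written only for illustration; it is not part of Hub, and Hub may accept other path forms that this sketch does not handle.

```python
def split_hub_path(path):
    """Split 'hub://<org>/<dataset>' into (org, dataset).

    Hypothetical helper for illustration only; not a Hub API.
    """
    prefix = "hub://"
    if not path.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}: {path!r}")
    # Everything before the first '/' is the organization namespace,
    # everything after it is the dataset name.
    org, _, name = path[len(prefix):].partition("/")
    if not org or not name:
        raise ValueError(f"expected 'hub://<org>/<dataset>': {path!r}")
    return org, name


org, name = split_hub_path("hub://activeloop/mnist-train")
print(org, name)
```

For the MNIST path used in this guide, the organization is `activeloop` and the dataset name is `mnist-train`.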
Data is not immediately read into memory because Hub operates lazily. You can fetch data by calling the .numpy() method, which reads data into a numpy array.
# Indexing
W = ds.images[0].numpy()              # Fetch an image and return a numpy array
X = ds.labels[0].numpy(as_list=True)  # Fetch a label and store it as a
                                      # list of numpy arrays

# Slicing
Y = ds.images[0:100].numpy()  # Fetch 100 images and return a numpy array
                              # The method above produces an exception if
                              # the images are not all the same size
Z = ds.labels[0:100].numpy(as_list=True)  # Fetch 100 labels and store
                                          # them as a list of numpy arrays
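If the lazy behavior is unfamiliar, the toy sketch below shows the general idea: indexing and slicing only record which samples were requested, and storage is touched only when data is explicitly fetched. The `LazyTensor` class, its `fetch()` method, and `read_sample` are hypothetical names invented for this illustration. They do not reflect how Hub is implemented; `fetch()` merely plays the role that `.numpy()` plays above.

```python
class LazyTensor:
    """Toy stand-in for a lazily loaded tensor (illustration only, not Hub code)."""

    def __init__(self, read_sample, length, indices=None):
        self._read_sample = read_sample  # callable that loads one sample by index
        self._indices = list(range(length)) if indices is None else indices

    def __getitem__(self, key):
        # Indexing only narrows the list of sample indices; nothing is read yet.
        selected = self._indices[key]
        if isinstance(key, int):
            selected = [selected]
        return LazyTensor(self._read_sample, 0, selected)

    def __len__(self):
        return len(self._indices)

    def fetch(self):
        # The actual reads happen here, one per selected sample.
        return [self._read_sample(i) for i in self._indices]


reads = []  # records every index that was actually loaded


def read_sample(i):
    reads.append(i)
    return i * 10  # placeholder for real sample data


images = LazyTensor(read_sample, length=1000)
subset = images[0:100]           # no reads so far
reads_before_fetch = len(reads)  # number of samples loaded before fetching
data = subset.fetch()            # loads exactly the 100 selected samples
```

Only the 100 samples in the slice are loaded; the remaining 900 are never read. This is why slicing a large remote dataset before fetching keeps both download size and memory use small.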
Congratulations, you've got Hub working on your local machine🤓