Step 6: Connecting Hub Datasets to ML Frameworks
Connecting Hub Datasets to machine learning frameworks such as PyTorch and TensorFlow.
You can connect Hub Datasets to popular ML frameworks such as PyTorch and TensorFlow with minimal boilerplate code, and Hub takes care of the parallel processing for you.

PyTorch

You can train a model by creating a PyTorch DataLoader from a Hub Dataset using ds.pytorch().

import hub
from torch.utils.data import DataLoader

ds = hub.dataset('./dataset_path')  # Hub Dataset
dataloader = ds.pytorch(batch_size=16, num_workers=2)  # PyTorch DataLoader

for data in dataloader:
    print(data)
    # Training loop goes here
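
As a rough illustration of what the training loop placeholder above might contain, here is a minimal, framework-agnostic sketch. It assumes each batch yielded by the dataloader behaves like a dictionary keyed by tensor name; the keys "images" and "labels", the helper run_epoch, and the toy squared_error_step are all hypothetical and not part of Hub. A plain list of dictionaries stands in for the dataloader so the sketch runs without Hub or PyTorch installed.

```python
from typing import Any, Callable, Iterable, Mapping


def run_epoch(
    batches: Iterable[Mapping[str, Any]],
    train_step: Callable[[Any, Any], float],
    input_key: str = "images",  # hypothetical tensor name (assumption)
    label_key: str = "labels",  # hypothetical tensor name (assumption)
) -> float:
    """Iterate once over the batches and return the mean per-batch loss."""
    total, count = 0.0, 0
    for batch in batches:
        # Assumption: each batch is dict-like and keyed by tensor name
        loss = train_step(batch[input_key], batch[label_key])
        total += float(loss)
        count += 1
    return total / count if count else 0.0


def squared_error_step(inputs, labels):
    # Placeholder "model" that predicts the input itself; a real step would
    # run a forward pass, compute the loss, and update the model weights
    return sum((x - y) ** 2 for x, y in zip(inputs, labels)) / len(inputs)


# Stand-in for the dataloader (assumption: same iteration shape)
fake_batches = [
    {"images": [1.0, 2.0], "labels": [1.0, 2.0]},
    {"images": [3.0, 4.0], "labels": [2.0, 5.0]},
]

mean_loss = run_epoch(fake_batches, squared_error_step)
print(mean_loss)
```

In practice you would pass the dataloader itself in place of fake_batches and replace squared_error_step with a function that performs the forward pass, backward pass, and optimizer step for your model.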

TensorFlow - Coming Soon

Similarly, you can convert a Hub Dataset to a TensorFlow Dataset via the tf.data API.

import hub

ds = hub.dataset('./dataset_path')  # Hub Dataset object, to be used for training
ds = ds.tensorflow()  # A TensorFlow Dataset