Performant Dataloader

How to use Deep Lake's performant Dataloader built and optimized in C++

Deep Lake offers an optimized C++ implementation of its dataloader that is 1.5-3X faster than the pure-Python implementation and supports distributed training. The C++ and Python dataloaders can be used interchangeably, but their syntax differs, as shown below.

Pure-Python Dataloader

train_loader = ds_train.pytorch(num_workers = 8,
                                transform = transform,
                                batch_size = 32,
                                tensors = ['images', 'labels'],
                                shuffle = True)

C++ Dataloader

The C++ dataloader is installed with pip install "deeplake[enterprise]". Details on all installation options are available here.

train_loader = ds_train.dataloader()\
                       .transform(transform)\
                       .batch(32)\
                       .shuffle(True)\
                       .pytorch(tensors = ['images', 'labels'], num_workers = 8)
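Because both loaders take the same options and differ only in where those options are passed, a small wrapper can make the backend switchable. The following is a minimal sketch, not part of Deep Lake: make_loader and its use_cpp flag are hypothetical names introduced here for illustration, and the body relies only on the calls and keyword arguments shown in the two snippets above.

```python
def make_loader(ds, transform, use_cpp=True, batch_size=32, num_workers=8,
                tensors=("images", "labels"), shuffle=True):
    """Build a PyTorch loader from a Deep Lake dataset with either backend.

    Hypothetical helper (not a Deep Lake API): it only forwards the options
    to the chained calls or keyword arguments shown in the snippets above.
    """
    tensors = list(tensors)
    if use_cpp:
        # C++ dataloader: options are set through chained builder calls.
        return (
            ds.dataloader()
            .transform(transform)
            .batch(batch_size)
            .shuffle(shuffle)
            .pytorch(tensors=tensors, num_workers=num_workers)
        )
    # Pure-Python dataloader: the same options go into a single call.
    return ds.pytorch(
        num_workers=num_workers,
        transform=transform,
        batch_size=batch_size,
        tensors=tensors,
        shuffle=shuffle,
    )
```

With a dataset loaded as ds_train, make_loader(ds_train, transform) would build the C++ loader, and make_loader(ds_train, transform, use_cpp=False) would fall back to the pure-Python one, for example on a machine where deeplake[enterprise] is not installed.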