Performant Dataloader (Beta)

How to use Deep Lake's new dataloader built and optimized in C++

Deep Lake offers an Alpha version of its dataloader that was built and optimized in C++. The new dataloader is 2-3X faster in many applications, but because it is still in an experimental state, it is not as reliable as the pure-Python dataloader described here.

The two dataloaders can be used interchangeably, but their syntax differs, as shown below.

Pure-Python Dataloader

train_loader = ds_train.pytorch(num_workers=8,
                                transform=transform,
                                batch_size=32,
                                tensors=['images', 'labels'],
                                shuffle=True)

C++ Dataloader

The C++ dataloader is currently available only on Linux machines. It also returns image tensors as PIL images, whereas the pure-Python dataloader returns them as NumPy arrays.
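Because the two dataloaders hand images to the transform in different types, a transform shared between them may need to accept both. The following is a minimal sketch, not part of Deep Lake: `to_float_array` is a hypothetical helper, and it assumes 8-bit pixel values (0 to 255). It relies on `np.asarray`, which accepts NumPy arrays as well as PIL images.

```python
import numpy as np


def to_float_array(image):
    """Convert a PIL image or a NumPy array to a float32 array in [0, 1].

    Hypothetical helper for illustration; assumes 8-bit pixel values.
    """
    # np.asarray handles both input types: NumPy arrays are converted
    # directly, and PIL images are read through their array interface.
    array = np.asarray(image, dtype=np.float32)
    return array / 255.0
```

Calling a helper like this at the start of the image transform lets the rest of the transform work on NumPy arrays regardless of which dataloader produced the sample.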

from deeplake.experimental import dataloader

# Chain the transform, batching, and shuffling steps, then build the PyTorch loader
train_loader = dataloader(ds_train)\
                .transform(transform)\
                .batch(32)\
                .shuffle()\
                .pytorch(tensors=['images', 'labels'], num_workers=8)
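Since the speedup depends on the application, it can be worth timing both dataloaders on your own data. The sketch below is one way to do that, assuming only that a loader can be iterated batch by batch. `measure_throughput` and its `max_batches` parameter are hypothetical names, not part of Deep Lake.

```python
import time


def measure_throughput(loader, max_batches=100):
    """Return batches per second over at most `max_batches` batches.

    Hypothetical helper for illustration; works with any iterable.
    """
    start = time.perf_counter()
    count = 0
    for _ in loader:
        count += 1
        # Stop early so large datasets are only sampled, not fully read.
        if count >= max_batches:
            break
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float('inf')
```

Running `measure_throughput(train_loader)` once for each dataloader, with the same batch size and number of workers, gives a rough comparison. The first batches often include startup cost, so a larger `max_batches` gives a steadier number.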
