Performant Dataloader
How to use Deep Lake's new dataloader built and optimized in C++
Deep Lake offers an optimized implementation of its dataloader built in C++, which is 1.5-3X faster than the pure-Python implementation and supports distributed training. The C++ and Python dataloaders can be used interchangeably, but their syntax differs as shown below, with the pure-Python version first and the C++ version after the installation note.
train_loader = ds_train.pytorch(num_workers=8,
                                transform=transform,
                                batch_size=32,
                                tensors=['images', 'labels'],
                                shuffle=True)
The C++ dataloader is installed using:
    pip install "deeplake[enterprise]"
Details on all installation options are available here.
train_loader = ds.dataloader()\
    .transform(transform)\
    .batch(32)\
    .shuffle(True)\
    .pytorch(tensors=['images', 'labels'], num_workers=8)