Creating Video Datasets

Get started with video datasets using Deep Lake.

How to convert a video dataset to Deep Lake format

This tutorial is also available as a Colab Notebook

Video datasets are becoming increasingly common in computer vision applications. This tutorial demonstrates how to convert a simple video classification dataset into Deep Lake format. Uploading videos to Deep Lake is nearly identical to uploading images, aside from minor differences in sample compression that are described below.

Create the Deep Lake Dataset

The first step is to download the small dataset called running walking.

The dataset has the following folder structure:

data_dir
|_running
    |_video_1.mp4
    |_video_2.mp4
|_walking
    |_video_3.mp4
    |_video_4.mp4

Now that you have the data, let's create a Deep Lake Dataset in the ./running_walking_deeplake folder by running:

import deeplake
from PIL import Image
import numpy as np
import os

ds = deeplake.empty('./running_walking_deeplake') # Create the dataset locally

Next, let's inspect the folder structure for the source dataset ./running_walking to find the class names and the files that need to be uploaded to the Deep Lake dataset.

# Find the class_names and list of files that need to be uploaded
dataset_folder = './running_walking'

class_names = os.listdir(dataset_folder)

fn_vids = []
for dirpath, dirnames, filenames in os.walk(dataset_folder):
    for filename in filenames:
        fn_vids.append(os.path.join(dirpath, filename))
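Note that os.listdir returns every entry in the folder, including hidden files such as .DS_Store, and in no guaranteed order, so label indices can change between runs. The snippet below is a minimal sketch of a more defensive scan, not part of the Deep Lake API. The function name scan_video_folder and the VIDEO_EXTENSIONS tuple are assumptions made for illustration; adjust the extensions to match your data.

```python
import os

# Assumption: file extensions that should be treated as videos.
VIDEO_EXTENSIONS = ('.mp4', '.mkv', '.avi')

def scan_video_folder(dataset_folder):
    """Return (class_names, video_paths) for a folder-per-class layout."""
    # Keep only visible sub-directories and sort them so that
    # label indices are stable across runs and machines.
    class_names = sorted(
        name for name in os.listdir(dataset_folder)
        if os.path.isdir(os.path.join(dataset_folder, name))
        and not name.startswith('.')
    )

    video_paths = []
    for class_name in class_names:
        class_dir = os.path.join(dataset_folder, class_name)
        for dirpath, _dirnames, filenames in os.walk(class_dir):
            for filename in sorted(filenames):
                # Skip anything that does not look like a video file.
                if filename.lower().endswith(VIDEO_EXTENSIONS):
                    video_paths.append(os.path.join(dirpath, filename))
    return class_names, video_paths
```

Calling class_names, fn_vids = scan_video_folder('./running_walking') would produce the same two variables used in the rest of this tutorial.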

Finally, let's create the tensors and iterate through all the videos in the dataset in order to upload the data to Deep Lake.

with ds:
    ds.create_tensor('videos', htype='video', sample_compression = 'mp4')
    ds.create_tensor('labels', htype='class_label', class_names = class_names)

    for fn_vid in fn_vids:
        label_text = os.path.basename(os.path.dirname(fn_vid))
        label_num = class_names.index(label_text)

        # Append data to tensors
        ds.videos.append(deeplake.read(fn_vid))
        ds.labels.append(np.uint32(label_num))
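The label lookup in the loop above assumes that each video's parent folder name appears in class_names. As a rough sketch of that step in isolation, the helper below (label_index_for is a hypothetical name, not a Deep Lake function) performs the same lookup and raises a more descriptive error when a file sits in an unexpected folder.

```python
import os

def label_index_for(video_path, class_names):
    """Map a video path to the index of its parent folder in class_names."""
    # The parent folder name is the class label, e.g. ".../walking/video_3.mp4".
    label_text = os.path.basename(os.path.dirname(video_path))
    try:
        return class_names.index(label_text)
    except ValueError:
        raise ValueError(
            f"Folder '{label_text}' for '{video_path}' is not in {class_names}"
        ) from None
```

Running this over fn_vids before opening the dataset is one way to catch stray files early, so that the videos and labels tensors do not end up with different lengths after a failed append.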

Inspect the Deep Lake Dataset

Let's check out the first frame in the second sample from this dataset.

video_ind = 1
frame_ind = 0

# Individual frames are loaded lazily
img = Image.fromarray(ds.videos[video_ind][frame_ind].numpy())

# Load the numeric label and read the class name from ds.labels.info.class_names
print(ds.labels.info.class_names[ds.labels[video_ind].numpy()[0]])
img

You've successfully created a video dataset in Activeloop Deep Lake.

Congrats! You just created a video classification dataset! 🎉
