Dataset Visualization
How to visualize Deep Lake datasets
How to visualize machine learning datasets
Deep Lake has a web interface for visualizing, versioning, and querying machine learning datasets. It uses the Deep Lake format under the hood and can connect to datasets stored in any Deep Lake storage location.
Visualization can be performed in 3 ways:
In the Deep Lake UI (most feature-rich and performant option)
In the Python API using ds.visualize()
In your own application using our integration options.
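As a rough illustration of the Python API option, the sketch below loads a dataset and opens the visualizer inline. It assumes the Deep Lake 3.x Python package running inside a Jupyter or Colab notebook. The helper name visualize_dataset is hypothetical, and the optional width and height arguments are passed straight through to ds.visualize().

```python
def visualize_dataset(path, width=None, height=None):
    """Load a Deep Lake dataset read-only and render it inline in a notebook.

    Hypothetical helper for illustration; assumes the Deep Lake 3.x API.
    """
    # Imported lazily so this module can be imported without deeplake installed.
    import deeplake

    # Read-only access is enough for viewing and avoids accidental writes.
    ds = deeplake.load(path, read_only=True)

    # Opens the visualizer in the notebook output cell.
    ds.visualize(width=width, height=height)
    return ds
```

In a notebook cell, a call such as visualize_dataset("hub://activeloop/mnist-train") would be typical for a public dataset, assuming that path is accessible to you.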
Requirements for correctly visualizing your own datasets
Deep Lake makes assumptions about underlying data types and relationships between tensors in order to display the data correctly. Understanding the following concepts is necessary in order to use the visualizer:
Visualizer Controls and Modes
Downsampling Data for Faster Visualization
For faster visualization of images and masks, tensors can be downsampled during dataset creation. The downsampled data are stored in the dataset and are automatically rendered by the visualizer depending on the zoom level.
To add downsampling to your tensors, specify the downsampling factor and the number of downsampling layers during tensor creation:
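A minimal sketch of such a call, assuming the Deep Lake 3.x Python API, where create_tensor accepts a downsampling tuple of (factor, number of layers). The helper name, the tensor name, the default values, and the JPEG sample compression are illustrative choices, not requirements.

```python
def create_downsampled_tensor(ds, name="images", factor=3, layers=4):
    """Create an image tensor that also stores downsampled copies of each sample.

    Hypothetical helper for illustration. `ds` is expected to be a writable
    Deep Lake 3.x dataset. With the defaults, each of the 4 layers is
    downsampled 3X relative to the previous one.
    """
    return ds.create_tensor(
        name,
        htype="image",
        sample_compression="jpeg",      # assumed compression for this example
        downsampling=(factor, layers),  # (downsampling factor, number of layers)
    )
```

With a real dataset, for instance one created by deeplake.empty("mem://demo"), calling create_downsampled_tensor(ds) would set up the tensor before any images are appended.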
Note: Because downsampling requires decompressing and recompressing the data, it slows down dataset ingestion.