Step 4: Accessing and Updating Data
Learn how Deep Lake Datasets can be accessed or loaded from a variety of storage locations.
How to Access and Load Datasets with Deep Lake
Loading Datasets
Deep Lake Datasets can be loaded from a variety of storage locations by passing the dataset path to deeplake.load(path) or deeplake.dataset(path).
Since ds = deeplake.dataset(path) can be used to both create and load datasets, you may accidentally create a new dataset if there is a typo in the path you provided while intending to load a dataset. If that occurs, simply use ds.delete() to remove the unintended dataset permanently.
Referencing Tensors
Deep Lake allows you to reference specific tensors using keys or via the "." notation outlined below.
Note: data is still not loaded by these commands.
Accessing Data
Data within the tensors is loaded and accessed using the .numpy(), .data(), and .tobytes() commands. When the underlying data can be converted to a numpy array, .data() and .numpy() return equivalent objects.
The .numpy() method raises an exception if the samples in the requested tensor do not all have the same shape. In that case, .numpy(aslist=True) returns a list of NumPy arrays, where each list index corresponds to one sample.
Updating Data
Existing data in a Deep Lake dataset can be updated by assigning a new value to the corresponding index of a tensor.