Restoring Corrupted Datasets
Restoring Deep Lake datasets that may be corrupted.
Deliberate or accidental interruption of code can leave a Deep Lake dataset, or some of its tensors, unreadable. Interruptions become more likely at scale, and Deep Lake's version control is the primary tool for recovery.
When manipulating Deep Lake datasets, commit periodically to create snapshots of the dataset that can be accessed later. This can be done automatically when creating datasets with deeplake.compute, or manually using our version control API.
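As a rough illustration of manual periodic commits, the sketch below appends samples and commits after every fixed number of appends. The helper name `append_with_checkpoints` and its `commit_every` parameter are hypothetical and not part of Deep Lake. The helper only assumes the dataset object exposes `append(sample)` and `commit(message)`, which is how the Deep Lake version control API is understood to behave here.

```python
def append_with_checkpoints(ds, samples, commit_every=1000):
    """Append samples to ds, committing after every `commit_every` appends.

    Hypothetical helper: `ds` is assumed to expose `append(sample)` and
    `commit(message)`, where `commit` returns a commit id.
    """
    if commit_every < 1:
        raise ValueError("commit_every must be at least 1")

    commit_ids = []
    pending = 0  # samples appended since the last commit
    for sample in samples:
        ds.append(sample)
        pending += 1
        if pending == commit_every:
            commit_ids.append(ds.commit(f"Checkpoint after {commit_every} samples"))
            pending = 0

    # Commit whatever is left so the last partial batch is also snapshotted.
    if pending:
        commit_ids.append(ds.commit("Final checkpoint"))
    return commit_ids


# Usage with a real dataset (assumes the Deep Lake Python package is installed):
#   import deeplake
#   ds = deeplake.load(<dataset_path>)
#   append_with_checkpoints(ds, samples, commit_every=500)
```

With commits spaced this way, an interruption costs at most the samples appended since the last checkpoint.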
If a dataset becomes corrupted, you may see an error like the following when loading it:
DatasetCorruptError: Exception occured (see Traceback). The dataset maybe corrupted. Try using `reset=True` to reset HEAD changes and load the previous commit. This will delete all uncommitted changes on the branch you are trying to load.
To reset the uncommitted corrupted changes, load the dataset with the reset=True flag:
ds = deeplake.load(<dataset_path>, reset = True)
Note: this operation deletes all uncommitted changes.
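The recovery step can also be wrapped so that a reset happens only when a normal load fails. The following is a minimal sketch, not an official Deep Lake utility: `load_with_reset_fallback` is a hypothetical helper, and the load function and exception class are passed in as arguments so the sketch does not depend on where the library defines `DatasetCorruptError` (the `deeplake.util.exceptions` import shown in the usage comment is an assumption).

```python
def load_with_reset_fallback(path, load, corrupt_error):
    """Try a normal load; on corruption, retry with reset=True.

    Hypothetical helper. `load` is a callable such as deeplake.load and
    `corrupt_error` is the exception class raised for corrupted datasets.
    Returns (dataset, was_reset).
    """
    try:
        return load(path), False
    except corrupt_error:
        # reset=True discards ALL uncommitted changes on the branch being
        # loaded and falls back to the previous commit.
        return load(path, reset=True), True


# Usage (assumes the Deep Lake Python package; the import path of the
# exception is an assumption and may differ between versions):
#   import deeplake
#   from deeplake.util.exceptions import DatasetCorruptError
#   ds, was_reset = load_with_reset_fallback(
#       <dataset_path>, deeplake.load, DatasetCorruptError
#   )
#   if was_reset:
#       print("Uncommitted changes were discarded.")
```

Returning the `was_reset` flag lets the caller log or surface the fact that uncommitted work was deleted, rather than resetting silently.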