ds.info.update(description='My first Hub dataset')
Specifying htype and dtype is not required, but it is highly recommended in order to optimize performance, especially for large datasets. Use dtype to specify the numeric type of the tensor data, and use htype to specify the underlying data structure. More information on htype can be found here.
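One reason the numeric type matters is raw storage size. The sketch below uses only the Python standard library to compare the uncompressed size of a single image under two numeric types. The image shape and the helper name `raw_size_bytes` are assumptions made for illustration, and the numbers say nothing about how Hub itself stores or compresses samples.

```python
import struct

def raw_size_bytes(shape, type_code):
    """Uncompressed size of an array of `shape` whose elements use the
    given struct type code ('B' = uint8, 'I' = uint32, 'd' = float64)."""
    count = 1
    for dim in shape:
        count *= dim
    # "<" selects the standard (platform-independent) element size
    return count * struct.calcsize("<" + type_code)

image_shape = (224, 224, 3)  # hypothetical image: height, width, channels

size_uint8 = raw_size_bytes(image_shape, "B")    # 150528 bytes
size_float64 = raw_size_bytes(image_shape, "d")  # 1204224 bytes
```

In this sketch the same image is eight times larger as float64 than as uint8, and that difference is multiplied by every sample in a large dataset.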
Finally, let's populate the data in the tensors.
# Iterate through the files and append to hub dataset
ds.images.append(hub.read(file))  # Append to images tensor using hub.read
ds.labels.append(np.uint32(label_num))  # Append to labels tensor
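The two append calls above run once per image, and the `file` and `label_num` values come from a loop over the source files. As a minimal sketch, assuming the images sit in one subfolder per class (a hypothetical layout), those pairs could be gathered with the standard library alone. The helper name `list_labeled_files` is illustrative and is not part of Hub.

```python
import os

def list_labeled_files(root):
    """Return (class_names, [(file_path, label_num), ...]) for a folder
    that holds one subdirectory per class (an assumed layout)."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    pairs = []
    for label_num, class_name in enumerate(class_names):
        class_dir = os.path.join(root, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            file_path = os.path.join(class_dir, file_name)
            if os.path.isfile(file_path):
                # label_num is the index of the class folder in class_names
                pairs.append((file_path, label_num))
    return class_names, pairs
```

Each `(file, label_num)` pair returned this way can then be passed to the two `append` calls shown above.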
ds.images.append(hub.read(path)) is functionally equivalent to ds.images.append(np.array(PIL.Image.open(path))). However, hub.read() is significantly faster because it does not decompress and recompress the image when the file's compression matches the sample_compression of that tensor. Further details are available in Understanding Compression.
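To make the speed argument concrete, here is a sketch of the two ingestion paths that uses `zlib` from the Python standard library as a stand-in for an image codec. It does not call Hub, and `ingest_passthrough` and `ingest_recompress` are hypothetical names used only for illustration.

```python
import zlib

def ingest_passthrough(compressed: bytes) -> bytes:
    """Store the already-compressed bytes unchanged (no codec work)."""
    return compressed

def ingest_recompress(compressed: bytes) -> bytes:
    """Decode to raw bytes, then encode again before storing."""
    raw = zlib.decompress(compressed)
    return zlib.compress(raw)

raw_pixels = bytes(range(256)) * 1024  # fake "image" payload
on_disk = zlib.compress(raw_pixels)    # the file as it exists on disk

stored_fast = ingest_passthrough(on_disk)
stored_slow = ingest_recompress(on_disk)

# Both paths store data that decodes to the same pixels,
# but only the second one pays for a decode and an encode.
assert zlib.decompress(stored_fast) == raw_pixels
assert zlib.decompress(stored_slow) == raw_pixels
```

The passthrough path is only valid when the stored format is the one the file already uses, which mirrors the condition on matching compression described above.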
Check out the first image from this dataset. More details about Accessing Data are available in Step 5.
Congrats! You just created your first dataset! 🎉
Creating Tensor Hierarchies - Coming Soon
Often it's important to create tensors hierarchically, because information in different tensors may be inherently coupled, such as bounding boxes and their corresponding labels. A hierarchy can be created using the following lines of code:
# Tensors are accessed via:
For more detailed information regarding accessing datasets and their tensors, check out the next section.