Concurrent Writes
Concurrent writes in Deep Lake
Deep Lake offers three solutions for concurrently writing data, depending on the required scale of the application. Concurrency is not native to the Deep Lake format, so these solutions use locks and queues to schedule and linearize write operations to the dataset.
Concurrent writes can be supported using an external in-memory database that serves as the locking mechanism for Deep Lake datasets. Tools such as ZooKeeper or Redis are highly performant and reliable, and they can be deployed using a few lines of code. External locks are recommended for small-to-medium workloads.
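For example, a write guarded by a Redis lock might look like the following minimal sketch. The Redis connection details, lock name, dataset path, and tensor names are illustrative assumptions, not part of the Deep Lake API:

```python
import deeplake
import numpy as np
import redis

# Hypothetical Redis instance shared by all writer processes.
r = redis.Redis(host="localhost", port=6379)

# Every writer must acquire the same named lock before touching the dataset,
# which linearizes their append operations. If the lock cannot be acquired
# within blocking_timeout seconds, redis-py raises a LockError and the
# writer can retry later.
with r.lock("deeplake:my-dataset-write-lock", timeout=60, blocking_timeout=30):
    ds = deeplake.load("s3://my-bucket/my-dataset")  # hypothetical path
    ds.append({"images": np.zeros((512, 512, 3), dtype=np.uint8), "labels": 0})
```

Setting a `timeout` on the lock ensures it is released automatically if a writer crashes mid-operation, so other processes are not blocked indefinitely.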
COMING SOON. Deep Lake will offer a Managed Tensor Database that supports read (search) and write operations at scale. Deep Lake ensures the operations are performant by provisioning the necessary infrastructure and executing the underlying user requests in a distributed manner. This approach is recommended for production applications that require a separate service to handle the high computational loads of vector search.
Deep Lake datasets internally support file-based locks. File-based locks are generally slower and less reliable than the other solutions listed above, and they should only be used for prototyping.
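Because the lock is built into the format, no extra coordination code is needed; each process simply opens the dataset and writes. A minimal sketch, assuming a local dataset path and an `images` tensor (both hypothetical):

```python
import deeplake
import numpy as np

# Opening the dataset for writing engages Deep Lake's internal
# file-based lock; concurrent writers wait for it to be released.
ds = deeplake.load("./my-dataset")  # hypothetical path
ds.append({"images": np.zeros((512, 512, 3), dtype=np.uint8)})
```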