How to Contribute
Hub uses the black Python code formatter. You can have your code auto-formatted by running pip install black, then black . inside the directory you want to format.

Hub uses static typing for function arguments/variables for better code readability. Hub has a GitHub action that runs mypy . (invoked much like pytest .) to check for valid static typing. You can refer to the mypy documentation for more information.

Hub uses pytest for tests. To make it easier to contribute, Hub also has a set of custom options defined in conftest.py.
To see a list of Hub's custom pytest options, run this command:
pytest -h | sed -En '/custom options:/,/\[pytest\] ini\-options/p'
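As a rough illustration of where such options come from, the sketch below shows how a conftest.py could register boolean flags through pytest's pytest_addoption hook. The flag names are the ones mentioned in this guide; the help strings, the CUSTOM_FLAGS name, and the store_true behaviour are assumptions, and Hub's actual conftest.py may be organized differently.

```python
# Hypothetical conftest.py sketch. The flag names come from this guide;
# the help strings and the store_true behaviour are assumptions.
CUSTOM_FLAGS = {
    "--memory-skip": "Skip tests that use the memory_storage fixture.",
    "--local": "Run tests that use the local_storage fixture.",
    "--s3": "Run tests that use the s3_storage fixture.",
    "--cache-chains": "Also parametrize storage tests with cache chains.",
    "--keep-storage": "Do not delete storage roots after each test.",
}


def pytest_addoption(parser):
    """pytest hook: register each flag as an off-by-default boolean option."""
    for flag, help_text in CUSTOM_FLAGS.items():
        parser.addoption(flag, action="store_true", help=help_text)
```

Options registered directly on the parser like this are typically grouped under pytest's custom options heading in the help output, which is the section the sed command above extracts.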
The following fixtures are available:

memory_storage: If --memory-skip is provided, tests with this fixture will be skipped. Otherwise, the test will run with only a MemoryProvider.
local_storage: If --local is not provided, tests with this fixture will be skipped. Otherwise, the test will run with only a LocalProvider.
s3_storage: If --s3 is not provided, tests with this fixture will be skipped. Otherwise, the test will run with only an S3Provider.
storage: All tests that use the storage fixture will be parametrized with the enabled StorageProviders (enabled via the custom pytest options). If --cache-chains is provided, storage may also be a cache chain. A cache chain has the same interface as a StorageProvider, but instead of a single provider it consists of multiple providers chained in sequence, where the last provider in the chain is considered the actual storage.
ds: The same as the storage fixture, but the storages that are parametrized are wrapped with a Dataset.
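Note that the memory fixture is opt-out while the local and S3 fixtures are opt-in. The sketch below models only that skip rule as a plain function; the names OPT_OUT, OPT_IN, and skip_reason are hypothetical and do not come from Hub's code. A real fixture would presumably read the flags from the pytest config and pass the reason to pytest.skip.

```python
from typing import Optional, Set

# Rules as described in the fixture list: memory storage is skipped only
# when its flag is given; local and S3 storage run only when theirs is.
OPT_OUT = {"memory_storage": "--memory-skip"}
OPT_IN = {"local_storage": "--local", "s3_storage": "--s3"}


def skip_reason(fixture: str, flags: Set[str]) -> Optional[str]:
    """Return why a test using `fixture` is skipped, or None if it runs."""
    if fixture in OPT_OUT and OPT_OUT[fixture] in flags:
        return f"{OPT_OUT[fixture]} was provided"
    if fixture in OPT_IN and OPT_IN[fixture] not in flags:
        return f"{OPT_IN[fixture]} was not provided"
    return None
```

For example, with no flags at all, a test using memory_storage runs, while tests using local_storage or s3_storage are skipped.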
Each StorageProvider/Dataset that is created for a test via a fixture will automatically have a root created before and destroyed after the test. If you want to keep this data after the test run, you can use the --keep-storage option.

Fixture Examples
Single storage provider fixture
def test_memory(memory_storage):
    # Test will skip if `--memory-skip` is provided
    memory_storage["key"] = b"1234"  # This data will only be stored in memory


def test_local(local_storage):
    # Test will skip if `--local` is not provided
    local_storage["key"] = b"1234"  # This data will only be stored locally


def test_s3(s3_storage):
    # Test will skip if `--s3` is not provided
    # Test will fail if credentials are not provided
    s3_storage["key"] = b"1234"  # This data will only be stored in s3
Multiple storage providers/cache chains
from hub.core.tests.common import (
    parametrize_all_storages,
    parametrize_all_caches,
    parametrize_all_storages_and_caches,
)


@parametrize_all_storages
def test_storage(storage):
    # Storage will be parametrized with all enabled `StorageProvider`s
    pass


@parametrize_all_caches
def test_caches(storage):
    # Storage will be parametrized with all common caches containing enabled `StorageProvider`s
    pass


@parametrize_all_storages_and_caches
def test_storages_and_caches(storage):
    # Storage will be parametrized with all enabled `StorageProvider`s and common caches containing enabled `StorageProvider`s
    pass
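To make the idea of a cache chain more concrete, here is a minimal sketch of several dict-like providers chained in sequence, with the last one treated as the actual storage. The class name CacheChainSketch and its read-through/write-through behaviour are assumptions made for illustration; Hub's real cache implementation is not shown in this guide and may behave differently (for example, in how and when it flushes data).

```python
from typing import Dict, List


class CacheChainSketch:
    """Hypothetical chain of dict-like providers; the last one is the real storage."""

    def __init__(self, providers: List[Dict[str, bytes]]) -> None:
        if not providers:
            raise ValueError("a chain needs at least one provider")
        self.providers = providers

    def __getitem__(self, key: str) -> bytes:
        # Look in the fastest layer first and fall back toward the real storage.
        for index, provider in enumerate(self.providers):
            if key in provider:
                value = provider[key]
                # Copy the value into the earlier (faster) layers for next time.
                for earlier in self.providers[:index]:
                    earlier[key] = value
                return value
        raise KeyError(key)

    def __setitem__(self, key: str, value: bytes) -> None:
        # Assumption: write through every layer so the last provider always has the data.
        for provider in self.providers:
            provider[key] = value
```

Because the chain exposes the same item access as a single provider, a test body such as storage["key"] = b"1234" would work unchanged whichever one it receives.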
Dataset storage providers/cache chains
from hub.core.tests.common import (
    parametrize_all_dataset_storages,
    parametrize_all_dataset_storages_and_caches,
)


@parametrize_all_dataset_storages
def test_dataset(ds):
    # `ds` will be parametrized with 1 `Dataset` object per enabled `StorageProvider`
    pass


# Renamed so it does not shadow `test_dataset` when both live in the same module
@parametrize_all_dataset_storages_and_caches
def test_dataset_with_caches(ds):
    # `ds` will be parametrized with 1 `Dataset` object per enabled `StorageProvider` and all cache chains containing enabled `StorageProvider`s
    pass
TODO: Benchmarking is subject to change. This section will be updated once it is better defined.
Hub would not be possible without the work of our community.