
Visualization of the VCTK dataset in the Deep Lake UI
The VCTK dataset contains speech recorded from 109 native speakers of English with diverse accents. Each speaker reads about 400 sentences: most were selected from a newspaper, and the rest are the Rainbow Passage and an elicitation paragraph intended to identify the speaker’s accent. The Rainbow Passage and the elicitation paragraph are the same for all speakers. The newspaper texts were taken from The Herald (Glasgow), with permission from Herald & Times Group. Each speaker reads a different set of newspaper sentences, and each set was selected using a greedy algorithm.
Instead of downloading the VCTK dataset manually, you can load it in Python with one line of code using the open-source Deep Lake package.
import deeplake
ds = deeplake.load("hub://activeloop/vctk")
VCTK Data Fields
- audios: tensor containing the audio file in wave format.
- texts: tensor containing text transcript of the audio.
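As a rough illustration of how the two fields fit together, the sketch below summarizes one row of the dataset. It assumes the Deep Lake 3.x tensor accessors `.numpy()` and `.data()`, assumes that audio arrays are shaped (frames, channels), and assumes a 48 kHz sample rate for this copy of VCTK; the helper names are hypothetical, and the sample rate in particular should be checked against the actual audio before use.

```python
def clip_duration_seconds(num_frames: int, sample_rate_hz: int) -> float:
    """Return the length of a clip in seconds from its frame count and sample rate."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return num_frames / sample_rate_hz


def describe_sample(ds, index: int, sample_rate_hz: int = 48_000) -> dict:
    """Summarize one row of a dataset returned by deeplake.load(...).

    Assumptions (not confirmed by the dataset page): audio arrays are shaped
    (frames, channels), text rows expose their string under the "value" key,
    and the clips are sampled at 48 kHz.
    """
    waveform = ds.audios[index].numpy()            # decoded audio as an array
    transcript = ds.texts[index].data()["value"]   # transcript as a Python string
    num_frames = waveform.shape[0]
    return {
        "frames": num_frames,
        "seconds": clip_duration_seconds(num_frames, sample_rate_hz),
        "text": transcript,
    }
```

With the dataset loaded as `ds`, calling `describe_sample(ds, 0)` would return the frame count, estimated duration, and transcript of the first clip.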
VCTK Data Splits
- The VCTK training set contains 16,262 samples.
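Because only a training set is provided, you may want to hold out part of it for validation. The sketch below is one simple way to do that with a seeded random split over row indices; the function name, the 10% validation fraction, and the seed are illustrative choices, not part of Deep Lake or of the dataset.

```python
import random


def split_indices(num_samples: int, val_fraction: float = 0.1, seed: int = 0):
    """Return (train_indices, val_indices) as two sorted, disjoint lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)       # seeded, so the split is repeatable
    num_val = int(round(num_samples * val_fraction))
    return sorted(indices[num_val:]), sorted(indices[:num_val])


# 16262 is the training-set size stated above.
train_idx, val_idx = split_indices(16262)
```

Assuming the loaded dataset accepts a list of integers as an index, `ds[train_idx]` and `ds[val_idx]` would then select the two subsets. Note that a random split like this mixes speakers across both subsets; a speaker-disjoint split would need speaker labels, which are not among the fields listed above.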
Train a model on the VCTK dataset with PyTorch in Python
Let’s use Deep Lake’s built-in one-line PyTorch data loader to connect the data to compute:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
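Models trained on (audio, text) pairs usually need the transcripts as integer sequences rather than raw strings. The following is a minimal, dependency-free sketch of character-level encoding with padding; the helper names and the choice of reserving id 0 for padding are assumptions made for illustration, not something Deep Lake provides.

```python
def build_char_vocab(transcripts):
    """Map each distinct lowercase character to an integer id; 0 is reserved for padding."""
    chars = sorted({ch for text in transcripts for ch in text.lower()})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_transcript(text, vocab):
    """Convert a transcript to a list of ids, skipping characters missing from the vocabulary."""
    return [vocab[ch] for ch in text.lower() if ch in vocab]


def pad_ids(sequences, pad_id=0):
    """Right-pad id sequences so they all match the longest one in the batch."""
    longest = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
```

Helpers like these would typically be called from a custom collate function, which `ds.pytorch` accepts through its `collate_fn` argument. The audio clips also differ in length, so batches larger than one generally need similar padding on the waveform side.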
Train a model on the VCTK dataset with TensorFlow in Python
dataloader = ds.tensorflow()
- Homepage: https://datashare.ed.ac.uk/handle/10283/2950
- Paper: Yamagishi, Junichi and Veaux, Christophe and MacDonald, Kirsten. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit
- Point of Contact: N/A
VCTK Dataset Curators
Yamagishi, Junichi and Veaux, Christophe and MacDonald, Kirsten
VCTK Dataset Licensing Information
Deep Lake users may have access to a variety of publicly available datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have a license to use the datasets. It is your responsibility to determine whether you have permission to use the datasets under their license.
If you’re a dataset owner and do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thank you for your contribution to the ML community!
VCTK Dataset Citation Information
@misc{yamagishi2019vctk,
author={Yamagishi, Junichi and Veaux, Christophe and MacDonald, Kirsten},
title={ {CSTR VCTK Corpus}: English Multi-speaker Corpus for {CSTR} Voice Cloning Toolkit (version 0.92)},
publisher={University of Edinburgh. The Centre for Speech Technology Research (CSTR)},
year=2019,
doi={10.7488/ds/2645},
}
What is the VCTK dataset for Python?
The VCTK dataset is a multi-speaker English speech dataset. It was created to build HMM-based text-to-speech synthesis systems, especially speaker-adaptive systems that combine average voice models trained on multiple speakers with speaker adaptation techniques.
How to download the VCTK dataset in Python?
You can load the VCTK dataset quickly with one line of code using Activeloop Deep Lake, an open-source Python package. See the detailed instructions on how to load the VCTK dataset training subset in Python.
How can I use the VCTK dataset in PyTorch or TensorFlow?
You can stream the VCTK dataset while training a model in PyTorch or TensorFlow with one line of code using Activeloop Deep Lake, an open-source Python package. See the detailed instructions on how to train a model on the VCTK dataset with PyTorch in Python or with TensorFlow in Python.