Datasets:
The dataset viewer is not available for this split.
Error code: StreamingRowsError Exception: FileNotFoundError Message: https://datashare.is.ed.ac.uk/bitstream/handle/10283/3443/VCTK-Corpus-0.92.zip Traceback: Traceback (most recent call last): File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 980, in _wrap_create_connection return await self._loop.create_connection(*args, **kwargs) # type: ignore[return-value] # noqa File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1065, in create_connection raise exceptions[0] File "/usr/local/lib/python3.9/asyncio/base_events.py", line 1050, in create_connection sock = await self._connect_sock( File "/usr/local/lib/python3.9/asyncio/base_events.py", line 961, in _connect_sock await self.sock_connect(sock, address) File "/usr/local/lib/python3.9/asyncio/selector_events.py", line 500, in sock_connect return await fut File "/usr/local/lib/python3.9/asyncio/selector_events.py", line 535, in _sock_connect_cb raise OSError(err, f'Connect call failed {address}') TimeoutError: [Errno 110] Connect call failed ('10.70.21.37', 443) The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 417, in _info await _file_info( File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 833, in _file_info r = await session.get(url, allow_redirects=ar, **kwargs) File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/client.py", line 536, in _request conn = await self._connector.connect( File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 540, in connect proto = await self._create_connection(req, traces, timeout) File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 901, in _create_connection _, proto = await self._create_direct_connection(req, traces, timeout) File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 1209, in _create_direct_connection raise last_exc File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 1178, in _create_direct_connection transp, proto = await self._wrap_create_connection( File "/src/services/worker/.venv/lib/python3.9/site-packages/aiohttp/connector.py", line 988, in _wrap_create_connection raise client_error(req.connection_key, exc) from exc aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host datashare.ed.ac.uk:443 ssl:default [Connect call failed ('10.70.21.37', 443)] The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/src/services/worker/src/worker/utils.py", line 263, in get_rows_or_raise return get_rows( File "/src/services/worker/src/worker/utils.py", line 204, in decorator return func(*args, **kwargs) File "/src/services/worker/src/worker/utils.py", line 241, in get_rows rows_plus_one = list(itertools.islice(ds, rows_max_number + 1)) File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 1353, in __iter__ for key, example in ex_iterable: File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 207, in __iter__ yield from self.generate_examples_fn(**self.kwargs) File "/tmp/modules-cache/datasets_modules/datasets/vctk/eeb0c5a93221dfd9ef03140e994b3b762d474be37ade9b3cc9e24e07ed227b07/vctk.py", line 92, in _generate_examples with open(meta_path, encoding="utf-8") as meta_file: File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/streaming.py", line 74, in wrapper return function(*args, download_config=download_config, **kwargs) File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/download/streaming_download_manager.py", line 496, in xopen file_obj = fsspec.open(file, mode=mode, *args, **kwargs).open() File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py", line 439, in open return open_files( File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py", line 282, in open_files fs, fs_token, paths = get_fs_token_paths( File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py", line 606, in get_fs_token_paths fs = filesystem(protocol, **inkwargs) File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/registry.py", line 261, in filesystem return cls(**storage_options) File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/spec.py", line 76, in __call__ obj = super().__call__(*args, **kwargs) File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/zip.py", line 58, in __init__ self.fo = fo.__enter__() # the whole instance is a context File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/core.py", line 102, in __enter__ f = self.fs.open(self.path, mode=mode) File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/spec.py", line 1199, in open f = self._open( File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 356, in _open size = size or self.info(path, **kwargs)["size"] File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 115, in wrapper return sync(self.loop, func, *args, **kwargs) File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 100, in sync raise return_result File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in _runner result[0] = await coro File "/src/services/worker/.venv/lib/python3.9/site-packages/fsspec/implementations/http.py", line 430, in _info raise FileNotFoundError(url) from exc FileNotFoundError: https://datashare.is.ed.ac.uk/bitstream/handle/10283/3443/VCTK-Corpus-0.92.zip
Need help to make the dataset viewer work? Open a discussion for direct support.
Dataset Card for VCTK
Dataset Summary
This CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the rainbow passage and an elicitation paragraph used for the speech accent archive.
Supported Tasks and Leaderboards
[More Information Needed]
Languages
[More Information Needed]
Dataset Structure
Data Instances
A data point comprises the path to the audio file, called file
and its transcription, called text
.
{
'speaker_id': 'p225',
'text_id': '001',
'text': 'Please call Stella.',
'age': '23',
'gender': 'F',
'accent': 'English',
'region': 'Southern England',
'file': '/datasets/downloads/extracted/8ed7dad05dfffdb552a3699777442af8e8ed11e656feb277f35bf9aea448f49e/wav48_silence_trimmed/p225/p225_001_mic1.flac',
'audio':
{
'path': '/datasets/downloads/extracted/8ed7dad05dfffdb552a3699777442af8e8ed11e656feb277f35bf9aea448f49e/wav48_silence_trimmed/p225/p225_001_mic1.flac',
'array': array([0.00485229, 0.00689697, 0.00619507, ..., 0.00811768, 0.00836182, 0.00854492], dtype=float32),
'sampling_rate': 48000
},
'comment': ''
}
Each audio file is a single-channel FLAC with a sample rate of 48000 Hz.
Data Fields
Each row consists of the following fields:
speaker_id
: Speaker IDaudio
: Audio recordingfile
: Path to audio filetext
: Text transcription of corresponding audiotext_id
: Text IDage
: Speaker's agegender
: Speaker's genderaccent
: Speaker's accentregion
: Speaker's region, if annotation existscomment
: Miscellaneous comments, if any
Data Splits
The dataset has no predefined splits.
Dataset Creation
Curation Rationale
[More Information Needed]
Source Data
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
Annotations
Annotation process
[More Information Needed]
Who are the annotators?
[More Information Needed]
Personal and Sensitive Information
The dataset consists of people who have donated their voice online. You agree to not attempt to determine the identity of speakers in this dataset.
Considerations for Using the Data
Social Impact of Dataset
[More Information Needed]
Discussion of Biases
[More Information Needed]
Other Known Limitations
[More Information Needed]
Additional Information
Dataset Curators
[More Information Needed]
Licensing Information
Public Domain, Creative Commons Attribution 4.0 International Public License (CC-BY-4.0)
Citation Information
@inproceedings{Veaux2017CSTRVC,
title = {CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit},
author = {Christophe Veaux and Junichi Yamagishi and Kirsten MacDonald},
year = 2017
}
Contributions
Thanks to @jaketae for adding this dataset.
- Downloads last month
- 421