Datasets:

collabora
/

whisperspeech

Name: whisperspeech
Creator: Collabora
License: https://choosealicense.com/licenses/mit/

Tasks:

Text-to-Speech

Languages: English

License: mit

Dataset card Files Files and versions Community

Dataset Viewer

Go to dataset viewer

Viewer

The dataset viewer is not available for this dataset.

Cannot get the config names for the dataset.

Error code:   ConfigNamesError
Exception:    FileNotFoundError
Message:      Couldn't find a dataset script at /src/services/worker/collabora/whisperspeech/whisperspeech.py or any data file in the same directory. Couldn't find 'collabora/whisperspeech' on the Model Database Hub either: FileNotFoundError: Unable to resolve any data file that matches ['**'] in dataset repository collabora/whisperspeech with any supported extension ['.csv', '.tsv', '.json', '.jsonl', '.parquet', '.arrow', '.txt', '.blp', '.bmp', '.dib', '.bufr', '.cur', '.pcx', '.dcx', '.dds', '.ps', '.eps', '.fit', '.fits', '.fli', '.flc', '.ftc', '.ftu', '.gbr', '.gif', '.grib', '.h5', '.hdf', '.png', '.apng', '.jp2', '.j2k', '.jpc', '.jpf', '.jpx', '.j2c', '.icns', '.ico', '.im', '.iim', '.tif', '.tiff', '.jfif', '.jpe', '.jpg', '.jpeg', '.mpg', '.mpeg', '.msp', '.pcd', '.pxr', '.pbm', '.pgm', '.ppm', '.pnm', '.psd', '.bw', '.rgb', '.rgba', '.sgi', '.ras', '.tga', '.icb', '.vda', '.vst', '.webp', '.wmf', '.emf', '.xbm', '.xpm', '.BLP', '.BMP', '.DIB', '.BUFR', '.CUR', '.PCX', '.DCX', '.DDS', '.PS', '.EPS', '.FIT', '.FITS', '.FLI', '.FLC', '.FTC', '.FTU', '.GBR', '.GIF', '.GRIB', '.H5', '.HDF', '.PNG', '.APNG', '.JP2', '.J2K', '.JPC', '.JPF', '.JPX', '.J2C', '.ICNS', '.ICO', '.IM', '.IIM', '.TIF', '.TIFF', '.JFIF', '.JPE', '.JPG', '.JPEG', '.MPG', '.MPEG', '.MSP', '.PCD', '.PXR', '.PBM', '.PGM', '.PPM', '.PNM', '.PSD', '.BW', '.RGB', '.RGBA', '.SGI', '.RAS', '.TGA', '.ICB', '.VDA', '.VST', '.WEBP', '.WMF', '.EMF', '.XBM', '.XPM', '.aiff', '.au', '.avr', '.caf', '.flac', '.htk', '.svx', '.mat4', '.mat5', '.mpc2k', '.ogg', '.paf', '.pvf', '.raw', '.rf64', '.sd2', '.sds', '.ircam', '.voc', '.w64', '.wav', '.nist', '.wavex', '.wve', '.xi', '.mp3', '.opus', '.AIFF', '.AU', '.AVR', '.CAF', '.FLAC', '.HTK', '.SVX', '.MAT4', '.MAT5', '.MPC2K', '.OGG', '.PAF', '.PVF', '.RAW', '.RF64', '.SD2', '.SDS', '.IRCAM', '.VOC', '.W64', '.WAV', '.NIST', '.WAVEX', '.WVE', '.XI', '.MP3', '.OPUS', '.zip']
Traceback:    Traceback (most recent call last):
                File "/src/services/worker/src/worker/job_runners/dataset/config_names.py", line 56, in compute_config_names_response
                  for config in sorted(get_dataset_config_names(path=dataset, use_auth_token=use_auth_token))
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/inspect.py", line 325, in get_dataset_config_names
                  dataset_module = dataset_module_factory(
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/load.py", line 1227, in dataset_module_factory
                  raise FileNotFoundError(
              FileNotFoundError: Couldn't find a dataset script at /src/services/worker/collabora/whisperspeech/whisperspeech.py or any data file in the same directory. Couldn't find 'collabora/whisperspeech' on the Model Database Hub either: FileNotFoundError: Unable to resolve any data file that matches ['**'] in dataset repository collabora/whisperspeech with any supported extension ['.csv', '.tsv', '.json', '.jsonl', '.parquet', '.arrow', '.txt', '.blp', '.bmp', '.dib', '.bufr', '.cur', '.pcx', '.dcx', '.dds', '.ps', '.eps', '.fit', '.fits', '.fli', '.flc', '.ftc', '.ftu', '.gbr', '.gif', '.grib', '.h5', '.hdf', '.png', '.apng', '.jp2', '.j2k', '.jpc', '.jpf', '.jpx', '.j2c', '.icns', '.ico', '.im', '.iim', '.tif', '.tiff', '.jfif', '.jpe', '.jpg', '.jpeg', '.mpg', '.mpeg', '.msp', '.pcd', '.pxr', '.pbm', '.pgm', '.ppm', '.pnm', '.psd', '.bw', '.rgb', '.rgba', '.sgi', '.ras', '.tga', '.icb', '.vda', '.vst', '.webp', '.wmf', '.emf', '.xbm', '.xpm', '.BLP', '.BMP', '.DIB', '.BUFR', '.CUR', '.PCX', '.DCX', '.DDS', '.PS', '.EPS', '.FIT', '.FITS', '.FLI', '.FLC', '.FTC', '.FTU', '.GBR', '.GIF', '.GRIB', '.H5', '.HDF', '.PNG', '.APNG', '.JP2', '.J2K', '.JPC', '.JPF', '.JPX', '.J2C', '.ICNS', '.ICO', '.IM', '.IIM', '.TIF', '.TIFF', '.JFIF', '.JPE', '.JPG', '.JPEG', '.MPG', '.MPEG', '.MSP', '.PCD', '.PXR', '.PBM', '.PGM', '.PPM', '.PNM', '.PSD', '.BW', '.RGB', '.RGBA', '.SGI', '.RAS', '.TGA', '.ICB', '.VDA', '.VST', '.WEBP', '.WMF', '.EMF', '.XBM', '.XPM', '.aiff', '.au', '.avr', '.caf', '.flac', '.htk', '.svx', '.mat4', '.mat5', '.mpc2k', '.ogg', '.paf', '.pvf', '.raw', '.rf64', '.sd2', '.sds', '.ircam', '.voc', '.w64', '.wav', '.nist', '.wavex', '.wve', '.xi', '.mp3', '.opus', '.AIFF', '.AU', '.AVR', '.CAF', '.FLAC', '.HTK', '.SVX', '.MAT4', '.MAT5', '.MPC2K', '.OGG', '.PAF', '.PVF', '.RAW', '.RF64', '.SD2', '.SDS', '.IRCAM', '.VOC', '.W64', '.WAV', '.NIST', '.WAVEX', '.WVE', '.XI', '.MP3', '.OPUS', '.zip']

Need help to make the dataset viewer work? Open a discussion for direct support.

The WhisperSpeech Dataset

This dataset contains data to train SPEAR TTS-like text-to-speech models that utilized semantic tokens derived from the OpenAI Whisper speech recognition model.

We currently provide semantic and acoustic tokens for the LibriLight and LibriTTS datasets (English only).

Acoustic tokens:

24kHz EnCodec 6kbps (8 quantizers)

Semantic tokens:

Whisper tiny VQ bottleneck trained on a subset of LibriLight

Available LibriLight subsets:

small/medium/large (following the original dataset division but with large excluding the speaker 6454)
a separate ≈1300hr single-speaker subset based on the 6454 speaker from the large subset for training single-speaker TTS models

We plan to add more acoustic tokens from other codecs in the future.

Downloads last month: 4

Edit dataset card Evaluate models Model Database Leaderboard