Extreme Summarization (XSum) Dataset.
There are three features:

- document: Input news article.
- summary: One sentence summary of the article.
- id: BBC ID of the article.
An example of 'validation' looks as follows.
```json
{
  "document": "some-body",
  "id": "29750031",
  "summary": "some-sentence"
}
```
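To make the structure concrete, here is a minimal sketch of loading the dataset and inspecting a validation example with the `datasets` library. This assumes a standard installation; on recent `datasets` versions the script-based `xsum` loader may require `trust_remote_code=True`.

```python
from datasets import load_dataset

# Download and prepare all three splits of XSum.
# Recent versions of `datasets` may require trust_remote_code=True
# for script-based loaders such as xsum.
ds = load_dataset("xsum")

# Each example carries the three string fields described above.
example = ds["validation"][0]
print(example["id"])              # e.g. "29750031"
print(example["summary"])         # one-sentence summary
print(example["document"][:200])  # start of the source article
```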
The data fields are the same among all splits.
- document: a string feature.
- summary: a string feature.
- id: a string feature.

The split sizes are:

name | train | validation | test
---|---|---|---
default | 204045 | 11332 | 11334
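As a quick sanity check, the split sizes in the table can be verified directly once the dataset is loaded; this sketch assumes the same `load_dataset` call as above.

```python
from datasets import load_dataset

ds = load_dataset("xsum")

# Expected sizes from the table: 204045 / 11332 / 11334.
for split in ("train", "validation", "test"):
    print(f"{split}: {len(ds[split])} examples")
```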
```bibtex
@article{Narayan2018DontGM,
  title={Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization},
  author={Shashi Narayan and Shay B. Cohen and Mirella Lapata},
  journal={ArXiv},
  year={2018},
  volume={abs/1808.08745}
}
```
Thanks to @thomwolf, @lewtun, @mariamabarham, @jbragg, @lhoestq, @patrickvonplaten for adding this dataset.