Dataset tags:
- Tasks: Translation
- Multilinguality: translation
- Size Categories: 1K<n<10K
- Language Creators: found
- Annotations Creators: found
- Source Datasets: extended|opus_books
- License: afl-3.0
Note: the dataset viewer cannot display the `train` split. The records in `opus.json` are nested under a top-level `data` field, so the loader must be told which field to read: pass `field="data"` to the dataset loading method, as in the usage example below.
How to use it:
```python
from datasets import load_dataset

remote_dataset = load_dataset("VanessaSchenkel/opus_books_en_pt", field="data")
remote_dataset
```
Output:
```python
DatasetDict({
    train: Dataset({
        features: ['id', 'translation'],
        num_rows: 1404
    })
})
```
Example:
```python
remote_dataset["train"][5]
```
Output:
```python
{'id': '5',
 'translation': {'en': "There was nothing so very remarkable in that; nor did Alice think it so very much out of the way to hear the Rabbit say to itself, 'Oh dear!",
 'pt': 'Não havia nada de tão extraordinário nisso; nem Alice achou assim tão fora do normal ouvir o Coelho dizer para si mesmo: —"Oh, céus!'}}
```
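For training a translation model you typically want the nested `translation` dicts flattened into parallel (source, target) pairs. A minimal sketch, assuming rows shaped like the example above (the `to_pairs` helper is hypothetical, not part of the dataset or the `datasets` library):

```python
def to_pairs(rows):
    """Flatten rows with a nested 'translation' dict into (en, pt) tuples."""
    return [(r["translation"]["en"], r["translation"]["pt"]) for r in rows]

# A sample row shaped like the dataset's 'train' split (text shortened here)
sample = [{
    "id": "5",
    "translation": {
        "en": "There was nothing so very remarkable in that",
        "pt": "Não havia nada de tão extraordinário nisso",
    },
}]

pairs = to_pairs(sample)
# pairs[0] is an (English, Portuguese) tuple ready for a seq2seq pipeline
```

The same flattening can be applied lazily to the full split with `remote_dataset["train"].map(...)` if you prefer to stay inside the `datasets` API.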