Dataset Card for Brand-Product Relation Extraction Corpora (bprec)

Dataset Summary

Brand-Product Relation Extraction Corpora in Polish

Supported Tasks and Leaderboards

NER, Entity linking

Languages

Polish

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

  • id: int, identifier of the text
  • text: string, the text itself, for example a consumer comment on social media
  • ner: the extracted entities and the relations between them
    • source and target: a pair of related entities identified in the text
      • from: int, character offset at which the entity starts
      • text: string, the entity text
      • to: int, character offset at which the entity ends
      • type: one of the predefined entity types:
        • PRODUCT_NAME
        • PRODUCT_NAME_IMP
        • PRODUCT_NO_BRAND
        • BRAND_NAME
        • BRAND_NAME_IMP
        • VERSION
        • PRODUCT_ADJ
        • BRAND_ADJ
        • LOCATION
        • LOCATION_IMP
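
The field layout above can be sketched as a plain Python record. This is only an illustration under stated assumptions: the example text and offsets are invented rather than taken from the corpus, `ner` is assumed to be a list of source/target pairs, and `to` is assumed to be an end-exclusive offset (the card does not say whether the end character is included). The helper names `iter_relations` and `span_matches` are hypothetical.

```python
# Entity types as listed in the Data Fields section.
ENTITY_TYPES = {
    "PRODUCT_NAME", "PRODUCT_NAME_IMP", "PRODUCT_NO_BRAND",
    "BRAND_NAME", "BRAND_NAME_IMP", "VERSION",
    "PRODUCT_ADJ", "BRAND_ADJ", "LOCATION", "LOCATION_IMP",
}

# Hypothetical record: the text and offsets are invented for illustration.
record = {
    "id": 1,
    "text": "Kupiłem telefon Galaxy od Samsunga.",
    "ner": [
        {
            "source": {"from": 16, "text": "Galaxy", "to": 22, "type": "PRODUCT_NAME"},
            "target": {"from": 26, "text": "Samsunga", "to": 34, "type": "BRAND_NAME"},
        }
    ],
}


def iter_relations(rec):
    """Yield (source, target) entity dicts for every annotated pair."""
    for pair in rec["ner"]:
        yield pair["source"], pair["target"]


def span_matches(text, entity):
    """Check that text[from:to] reproduces the entity text.

    Assumes `to` is end-exclusive; if the real data uses an inclusive
    end offset, slice with entity["to"] + 1 instead.
    """
    return text[entity["from"]:entity["to"]] == entity["text"]


# Flatten each relation into (source text, source type, target text, target type).
relations = [
    (src["text"], src["type"], tgt["text"], tgt["type"])
    for src, tgt in iter_relations(record)
]
```

Running `span_matches` over a handful of real records is a quick way to find out which offset convention the data actually follows before relying on the spans.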

Data Splits

No train/validation/test split is provided. Instead, the dataset configurations correspond to four domain categories of the texts:

  • tele
  • electro
  • cosmetics
  • banking
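
Since no official split exists, users who need one have to create it themselves. The following is a minimal sketch of a reproducible random split over the records of one configuration. The function name `split_records`, the 80/10/10 proportions, and the fixed seed are assumptions chosen for illustration; the card does not recommend any of them.

```python
import random

# Configuration names as listed above.
CONFIGS = ("tele", "electro", "cosmetics", "banking")


def split_records(records, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle records with a fixed seed and cut them into three parts.

    The fractions and the seed are illustrative defaults, not values
    prescribed by the dataset authors.
    """
    items = list(records)
    random.Random(seed).shuffle(items)  # local RNG, so the result is reproducible
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    return {
        "test": items[:n_test],
        "validation": items[n_test:n_test + n_val],
        "train": items[n_test + n_val:],
    }


# Stand-in records; real ones would come from a single configuration.
demo = [{"id": i} for i in range(20)]
splits = split_records(demo)
```

Splitting each configuration separately keeps the four domains apart. Pooling them before the split is an equally valid choice if a mixed-domain model is the goal.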

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

[More Information Needed]

Citation Information

@inproceedings{inproceedings,
author = {Janz, Arkadiusz and Kopociński, Łukasz and Piasecki, Maciej and Pluwak, Agnieszka},
year = {2020},
month = {05},
pages = {},
title = {Brand-Product Relation Extraction Using Heterogeneous Vector Space Representations}
}

Contributions

Thanks to @kldarek for adding this dataset.
