_id
string
title
string
text
string
"1"
""
"What is the step by step guide to invest in share market in india?"
"2"
""
"What is the step by step guide to invest in share market?"
"3"
""
"What is the story of Kohinoor (Koh-i-Noor) Diamond?"
"4"
""
"What would happen if the Indian government stole the Kohinoor (Koh-i-Noor) diamond back?"
"5"
""
"How can I increase the speed of my internet connection while using a VPN?"
"6"
""
"How can Internet speed be increased by hacking through DNS?"
"7"
""
"Why am I mentally very lonely? How can I solve it?"
"8"
""
"Find the remainder when [math]23^{24}[/math] is divided by 24,23?"
"9"
""
"Which one dissolve in water quikly sugar, salt, methane and carbon di oxide?"
"10"
""
"Which fish would survive in salt water?"
"11"
""
"Astrology: I am a Capricorn Sun Cap moon and cap rising...what does that say about me?"
"12"
""
"I'm a triple Capricorn (Sun, Moon and ascendant in Capricorn) What does this say about me?"
"13"
""
"Should I buy tiago?"
"14"
""
"What keeps childern active and far from phone and video games?"
"15"
""
"How can I be a good geologist?"
"16"
""
"What should I do to be a great geologist?"
"17"
""
"When do you use シ instead of し?"
"18"
""
"When do you use "&" instead of "and"?"
"19"
""
"Motorola (company): Can I hack my Charter Motorolla DCX3400?"
"20"
""
"How do I hack Motorola DCX3400 for free internet?"
"21"
""
"Method to find separation of slits using fresnel biprism?"
"22"
""
"What are some of the things technicians can tell about the durability and reliability of Laptops and its components?"
"23"
""
"How do I read and find my YouTube comments?"
"24"
""
"How can I see all my Youtube comments?"
"25"
""
"What can make Physics easy to learn?"
"26"
""
"How can you make physics easy to learn?"
"27"
""
"What was your first sexual experience like?"
"28"
""
"What was your first sexual experience?"
"29"
""
"What are the laws to change your status from a student visa to a green card in the US, how do they compare to the immigration laws in Canada?"
"30"
""
"What are the laws to change your status from a student visa to a green card in the US? How do they compare to the immigration laws in Japan?"
"31"
""
"What would a Trump presidency mean for current international master’s students on an F1 visa?"
"32"
""
"How will a Trump presidency affect the students presently in US or planning to study in US?"
"33"
""
"What does manipulation mean?"
"34"
""
"What does manipulation means?"
"35"
""
"Why do girls want to be friends with the guy they reject?"
"36"
""
"How do guys feel after rejecting a girl?"
"37"
""
"Why are so many Quora users posting questions that are readily answered on Google?"
"38"
""
"Why do people ask Quora questions which can be answered easily by Google?"
"39"
""
"Which is the best digital marketing institution in banglore?"
"40"
""
"Which is the best digital marketing institute in Pune?"
"41"
""
"Why do rockets look white?"
"42"
""
"Why are rockets and boosters painted white?"
"43"
""
"What's causing someone to be jealous?"
"44"
""
"What can I do to avoid being jealous of someone?"
"45"
""
"What are the questions should not ask on Quora?"
"47"
""
"How much is 30 kV in HP?"
"48"
""
"Where can I find a conversion chart for CC to horsepower?"
"49"
""
"What does it mean that every time I look at the clock the numbers are the same?"
"50"
""
"How many times a day do a clock’s hands overlap?"
"51"
""
"What are some tips on making it through the job interview process at Medicines?"
"52"
""
"What are some tips on making it through the job interview process at Foundation Medicine?"
"53"
""
"What is web application?"
"54"
""
"What is the web application framework?"
"55"
""
"Does society place too much importance on sports?"
"56"
""
"How do sports contribute to the society?"
"57"
""
"What is best way to make money online?"
"58"
""
"What is best way to ask for money online?"
"59"
""
"How should I prepare for CA final law?"
"60"
""
"How one should know that he/she completely prepare for CA final exam?"
"61"
""
"What's one thing you would like to do better?"
"62"
""
"What's one thing you do despite knowing better?"
"63"
""
"What are some special cares for someone with a nose that gets stuffy during the night?"
"64"
""
"How can I keep my nose from getting stuffy at night?"
"65"
""
"What Game of Thrones villain would be the most likely to give you mercy?"
"66"
""
"What Game of Thrones villain would you most like to be at the mercy of?"
"67"
""
"Does the United States government still blacklist (employment, etc.) some United States citizens because their political views?"
"68"
""
"How is the average speed of gas molecules determined?"
"69"
""
"What is the best travel website in spain?"
"70"
""
"What is the best travel website?"
"71"
""
"Why do some people think Obama will try to take their guns away?"
"72"
""
"Has there been a gun control initiative to take away guns people already own?"
"73"
""
"I'm a 19-year-old. How can I improve my skills or what should I do to become an entrepreneur in the next few years?"
"74"
""
"I am a 19 year old guy. How can I become a billionaire in the next 10 years?"
"75"
""
"When a girlfriend asks her boyfriend "Why did you choose me? What makes you want to be with me?", what should one reply to her?"
"76"
""
"My girlfriend said that we should end this because she is confused about her feelings for me. I wished her well and disconnected. Should I call her and ask her if she wants to get back together?"
"77"
""
"How do we prepare for UPSC?"
"78"
""
"How do I prepare for civil service?"
"79"
""
"What is the stall speed and AOA of an f-14 with wings fully swept back?"
"80"
""
"Why did aircraft stop using variable-sweep wings, like those on an F-14?"
"81"
""
"Why do Slavs squat?"
"82"
""
"Will squats make my legs thicker?"
"83"
""
"When can I expect my Cognizant confirmation mail?"
"84"
""
"When can I expect Cognizant confirmation mail?"
"85"
""
"Can I make 50,000 a month by day trading?"
"86"
""
"Can I make 30,000 a month by day trading?"
"87"
""
"Is being a good kid and not being a rebel worth it in the long run?"
"88"
""
"Is being bored good for a kid?"
"89"
""
"What universities does Rexnord recruit new grads from? What majors are they looking for?"
"90"
""
"What universities does B&G Foods recruit new grads from? What majors are they looking for?"
"91"
""
"What is the quickest way to increase Instagram followers?"
"92"
""
"How can we increase our number of Instagram followers?"
"93"
""
"How did Darth Vader fought Darth Maul in Star Wars Legends?"
"94"
""
"Does Quora have a character limit for profile descriptions?"
"95"
""
"What are the stages of breaking up between couple? I mean, what happens after the breaking up emotionally whether its a male or female?"
"96"
""
"Who is affected more by a breakup, the boy or the girl?"
"97"
""
"What are some examples of products that can be make from crude oil?"
"98"
""
"What are some of the products made from crude oil?"
"99"
""
"How do I make friends."
"100"
""
"How to make friends ?"
"101"
""
"Is Career Launcher good for RBI Grade B preparation?"

Dataset Card for BEIR Benchmark

Dataset Summary

BEIR is a heterogeneous benchmark built from 18 diverse datasets representing 9 information retrieval tasks.

All these datasets have been preprocessed and can be used for your experiments.


Supported Tasks and Leaderboards

The dataset supports a leaderboard that evaluates models against task-specific metrics such as F1 or EM, as well as their ability to retrieve supporting information from Wikipedia.

The current best performing models can be found here.

Languages

All tasks are in English (en).

Dataset Structure

All BEIR datasets must contain a corpus, queries and qrels (relevance judgments file). They must be in the following format:

  • corpus file: a .jsonl file (JSON Lines) that contains one dictionary per line, each with three fields: _id, the unique document identifier; title, the document title (optional); and text, the document paragraph or passage. For example: {"_id": "doc1", "title": "Albert Einstein", "text": "Albert Einstein was a German-born...."}
  • queries file: a .jsonl file (JSON Lines) that contains one dictionary per line, each with two fields: _id, the unique query identifier, and text, the query text. For example: {"_id": "q1", "text": "Who developed the mass-energy equivalence formula?"}
  • qrels file: a .tsv file (tab-separated) that contains three columns: query-id, corpus-id and score, in this order. Keep the first row as a header. For example: q1 doc1 1
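
The three file formats described above can be read with the Python standard library alone. The following is a minimal sketch, not the loader shipped with BEIR; the function names `load_jsonl` and `load_qrels` are hypothetical, and the sketch assumes the qrels file has exactly one header row and integer scores.

```python
import csv
import json


def load_jsonl(path):
    """Read a JSON Lines file into a dict keyed by each record's _id."""
    records = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            records[record["_id"]] = record
    return records


def load_qrels(path):
    """Read a tab-separated qrels file into {query_id: {corpus_id: score}}."""
    qrels = {}
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        next(reader, None)  # assumption: the first row is a header
        for row in reader:
            if len(row) < 3:
                continue  # skip malformed or empty rows
            query_id, corpus_id, score = row[0], row[1], row[2]
            qrels.setdefault(query_id, {})[corpus_id] = int(score)
    return qrels
```

Both helpers return plain dictionaries, so a corpus loaded this way can be indexed directly by document id, and the qrels come back nested by query id and then document id.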

Data Instances

A high-level example of any BEIR dataset:

corpus = {
    "doc1" : {
        "title": "Albert Einstein", 
        "text": "Albert Einstein was a German-born theoretical physicist who developed the theory of relativity, \
                 one of the two pillars of modern physics (alongside quantum mechanics). His work is also known for \
                 its influence on the philosophy of science. He is best known to the general public for his mass–energy \
                 equivalence formula E = mc2, which has been dubbed 'the world's most famous equation'. He received the 1921 \
                 Nobel Prize in Physics 'for his services to theoretical physics, and especially for his discovery of the law \
                 of the photoelectric effect', a pivotal step in the development of quantum theory."
        },
    "doc2" : {
        "title": "", # Keep title an empty string if not present
        "text": "Wheat beer is a top-fermented beer which is brewed with a large proportion of wheat relative to the amount of \
                 malted barley. The two main varieties are German Weißbier and Belgian witbier; other types include Lambic (made\
                 with wild yeast), Berliner Weisse (a cloudy, sour beer), and Gose (a sour, salty beer)."
    },
}

queries = {
    "q1" : "Who developed the mass-energy equivalence formula?",
    "q2" : "Which beer is brewed with a large proportion of wheat?"
}

qrels = {
    "q1" : {"doc1": 1},
    "q2" : {"doc2": 1},
}
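
To illustrate how the nested qrels structure above is typically consumed, here is a small sketch that scores a ranking against it. Recall@k is chosen only as a simple illustration and is not stated by this card to be the benchmark's metric; the `results` dictionary (query id mapped to document id and retrieval score) and the function name `recall_at_k` are assumptions introduced for this example.

```python
def recall_at_k(qrels, results, k):
    """Mean fraction of relevant documents found in the top-k results per query."""
    per_query = []
    for query_id, judged in qrels.items():
        # Documents with a positive score are treated as relevant.
        relevant = {doc_id for doc_id, score in judged.items() if score > 0}
        if not relevant:
            continue
        # Rank the retrieved documents by descending retrieval score.
        ranked = sorted(
            results.get(query_id, {}).items(),
            key=lambda item: item[1],
            reverse=True,
        )
        top_k = {doc_id for doc_id, _ in ranked[:k]}
        per_query.append(len(relevant & top_k) / len(relevant))
    return sum(per_query) / len(per_query) if per_query else 0.0


qrels = {"q1": {"doc1": 1}, "q2": {"doc2": 1}}
# Hypothetical retrieval scores: q1 ranks doc1 first, q2 ranks doc1 first.
results = {
    "q1": {"doc1": 0.9, "doc2": 0.1},
    "q2": {"doc1": 0.8, "doc2": 0.3},
}
print(recall_at_k(qrels, results, k=1))  # 0.5
print(recall_at_k(qrels, results, k=2))  # 1.0
```

With k=1 only q1 has its relevant document at the top, so the mean is 0.5; with k=2 both relevant documents are covered.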

Data Fields

Examples from all configurations have the following features:

Corpus

  • corpus: a dict feature representing the document title and passage text, made up of:
    • _id: a string feature representing the unique document id
      • title: a string feature, denoting the title of the document.
      • text: a string feature, denoting the text of the document.

Queries

  • queries: a dict feature representing the query, made up of:
    • _id: a string feature representing the unique query id
    • text: a string feature, denoting the text of the query.

Qrels

  • qrels: a dict feature representing the query document relevance judgements, made up of:
    • _id: a string feature representing the query id
      • _id: a string feature, denoting the document id.
      • score: an int32 feature, denoting the relevance judgement between query and document.

Data Splits

Dataset | Website | BEIR-Name | Type | Queries | Corpus | Rel D/Q | Download | md5
MSMARCO | Homepage | msmarco | train, dev, test | 6,980 | 8.84M | 1.1 | Link | 444067daf65d982533ea17ebd59501e4
TREC-COVID | Homepage | trec-covid | test | 50 | 171K | 493.5 | Link | ce62140cb23feb9becf6270d0d1fe6d1
NFCorpus | Homepage | nfcorpus | train, dev, test | 323 | 3.6K | 38.2 | Link | a89dba18a62ef92f7d323ec890a0d38d
BioASQ | Homepage | bioasq | train, test | 500 | 14.91M | 8.05 | No | How to Reproduce?
NQ | Homepage | nq | train, test | 3,452 | 2.68M | 1.2 | Link | d4d3d2e48787a744b6f6e691ff534307
HotpotQA | Homepage | hotpotqa | train, dev, test | 7,405 | 5.23M | 2.0 | Link | f412724f78b0d91183a0e86805e16114
FiQA-2018 | Homepage | fiqa | train, dev, test | 648 | 57K | 2.6 | Link | 17918ed23cd04fb15047f73e6c3bd9d9
Signal-1M(RT) | Homepage | signal1m | test | 97 | 2.86M | 19.6 | No | How to Reproduce?
TREC-NEWS | Homepage | trec-news | test | 57 | 595K | 19.6 | No | How to Reproduce?
ArguAna | Homepage | arguana | test | 1,406 | 8.67K | 1.0 | Link | 8ad3e3c2a5867cdced806d6503f29b99
Touche-2020 | Homepage | webis-touche2020 | test | 49 | 382K | 19.0 | Link | 46f650ba5a527fc69e0a6521c5a23563
CQADupstack | Homepage | cqadupstack | test | 13,145 | 457K | 1.4 | Link | 4e41456d7df8ee7760a7f866133bda78
Quora | Homepage | quora | dev, test | 10,000 | 523K | 1.6 | Link | 18fb154900ba42a600f84b839c173167
DBPedia | Homepage | dbpedia-entity | dev, test | 400 | 4.63M | 38.2 | Link | c2a39eb420a3164af735795df012ac2c
SCIDOCS | Homepage | scidocs | test | 1,000 | 25K | 4.9 | Link | 38121350fc3a4d2f48850f6aff52e4a9
FEVER | Homepage | fever | train, dev, test | 6,666 | 5.42M | 1.2 | Link | 5a818580227bfb4b35bb6fa46d9b6c03
Climate-FEVER | Homepage | climate-fever | test | 1,535 | 5.42M | 3.0 | Link | 8b66f0a9126c521bae2bde127b4dc99d
SciFact | Homepage | scifact | train, test | 300 | 5K | 1.1 | Link | 5f7d1de60b170fc8027bb7898e2efca1
Robust04 | Homepage | robust04 | test | 249 | 528K | 69.9 | No | How to Reproduce?
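
The md5 values listed above can be used to check that a downloaded archive is intact. The following is a minimal sketch using Python's `hashlib`; the helper names and the local file name `quora.zip` are hypothetical, since the download URLs themselves are not reproduced in this card.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file's md5 digest equals the expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For instance, `matches_checksum("quora.zip", "18fb154900ba42a600f84b839c173167")` would compare a locally saved archive (file name assumed) against the value listed for Quora.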

Dataset Creation

Curation Rationale

[Needs More Information]

Source Data

Initial Data Collection and Normalization

[Needs More Information]

Who are the source language producers?

[Needs More Information]

Annotations

Annotation process

[Needs More Information]

Who are the annotators?

[Needs More Information]

Personal and Sensitive Information

[Needs More Information]

Considerations for Using the Data

Social Impact of Dataset

[Needs More Information]

Discussion of Biases

[Needs More Information]

Other Known Limitations

[Needs More Information]

Additional Information

Dataset Curators

[Needs More Information]

Licensing Information

[Needs More Information]

Citation Information

Cite as:

@inproceedings{
thakur2021beir,
title={{BEIR}: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models},
author={Nandan Thakur and Nils Reimers and Andreas R{\"u}ckl{\'e} and Abhishek Srivastava and Iryna Gurevych},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
year={2021},
url={https://openreview.net/forum?id=wCu6T5xFjeJ}
}

Contributions

Thanks to @Nthakur20 for adding this dataset.
