VQAv2 in Vietnamese

This is a Google-translated version of VQAv2 in Vietnamese. The Vietnamese version was built as follows:

  • In the en/ folder:
    • Download v2_OpenEnded_mscoco_train2014_questions.json and v2_mscoco_train2014_annotations.json from VQAv2.
    • Remove the answers key from every entry under the annotations key in v2_mscoco_train2014_annotations.json, so that only the multiple_choice_answer key of each annotation is used. The resulting file is called v2_OpenEnded_mscoco_train2014_answers.json.
    • Using a set data structure, I generate question_list.txt and answer_list.txt, which contain the unique texts. There are 152050 unique questions and 22531 unique answers among 443757 image-question-answer triplets.
  • In the vi/ folder:
    • By translating the two .txt files from en/, I generate answer_list.jsonl and question_list.jsonl. In each entry of each file, the key is the original English text and the value is the translated Vietnamese text.
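
The en/ steps above can be sketched roughly as follows. This is a minimal illustration, not the script actually used to build the dataset: the function names are hypothetical, and sorting the unique texts is an assumption made only to get a reproducible order. The JSON keys (questions, question, annotations, answers, multiple_choice_answer) are the ones that appear in the VQAv2 files.

```python
import json


def strip_answers(annotations_doc):
    """Return a copy of the annotations document without the per-annotation 'answers' key.

    Hypothetical helper: only 'multiple_choice_answer' (and the remaining
    metadata) is kept for each annotation. The input is not modified.
    """
    slim = dict(annotations_doc)
    slim["annotations"] = [
        {key: value for key, value in ann.items() if key != "answers"}
        for ann in annotations_doc["annotations"]
    ]
    return slim


def unique_texts(questions_doc, answers_doc):
    """Collect the unique question and answer strings using sets.

    Sorting is an assumption of this sketch; the original ordering of
    question_list.txt and answer_list.txt is not documented.
    """
    questions = {q["question"] for q in questions_doc["questions"]}
    answers = {a["multiple_choice_answer"] for a in answers_doc["annotations"]}
    return sorted(questions), sorted(answers)


def load_json(path):
    """Read one of the downloaded VQAv2 JSON files."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def write_lines(path, lines):
    """Write one text per line (assumed layout of the .txt files)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

Under these assumptions, calling strip_answers on the loaded annotations file, then unique_texts on the questions document and the stripped document, yields the two lists to pass to write_lines for question_list.txt and answer_list.txt.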

To load the Vietnamese version in your code, you need the original English version. Use each English text as a key to look up its Vietnamese translation in answer_list.jsonl and question_list.jsonl. I provide both the English and Vietnamese versions.

Please refer to this code to apply the translation.
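
Independently of the code referenced above, the lookup itself can be sketched as below. This is a minimal sketch under stated assumptions, not the author's implementation: it assumes each non-empty line of a .jsonl file is a JSON object mapping English text to Vietnamese text, the function names are hypothetical, and falling back to the English text when no translation is found is a choice made for this example.

```python
import json


def parse_translation_lines(lines):
    """Build an English -> Vietnamese dict from JSONL lines.

    Assumption: every non-empty line is a JSON object whose keys are
    English texts and whose values are their Vietnamese translations.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if line:
            table.update(json.loads(line))
    return table


def translate_field(records, field, table):
    """Return copies of the records with `field` replaced by its translation.

    Texts missing from the table are left in English (a choice made for
    this sketch). The input records are not modified.
    """
    return [
        {**record, field: table.get(record[field], record[field])}
        for record in records
    ]
```

For example, after opening vi/question_list.jsonl with encoding="utf-8" and passing the file object to parse_translation_lines, translate_field(questions_doc["questions"], "question", table) gives the Vietnamese questions. The same call with the field "multiple_choice_answer" and the table built from answer_list.jsonl covers the answers.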
