
Dataset Card for VQA-RAD

Dataset Description

VQA-RAD is a dataset of question-answer pairs on radiology images. The dataset is intended to be used for training and testing Medical Visual Question Answering (VQA) systems. The dataset includes both open-ended questions and binary "yes/no" questions. The dataset is built from MedPix, which is a free open-access online database of medical images. The question-answer pairs were manually generated by a team of clinicians.

Homepage: Open Science Framework Homepage
Paper: A dataset of clinically generated visual questions and answers about radiology images
Leaderboard: Papers with Code Leaderboard

Dataset Summary

The dataset was downloaded from the Open Science Framework Homepage on June 3, 2023. The dataset contains 2,248 question-answer pairs and 315 images. Out of the 315 images, 314 images are referenced by a question-answer pair, while 1 image is not used. The training set contains 3 duplicate image-question-answer triplets. The training set also has 1 image-question-answer triplet in common with the test set. After dropping these 4 image-question-answer triplets from the training set, the dataset contains 2,244 question-answer pairs on 314 images.
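
The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the script that was actually used to prepare the dataset: the function name `clean_training_set` and the use of a hashable `image_id` (for example a file name or a content hash) to stand in for an image are assumptions made for the example.

```python
from typing import Hashable, Iterable, List, Tuple

# (image_id, question, answer). image_id is a hypothetical stand-in for
# whatever identifies an image (file name, content hash, ...).
Triplet = Tuple[Hashable, str, str]


def clean_training_set(train: Iterable[Triplet], test: Iterable[Triplet]) -> List[Triplet]:
    """Keep the first copy of each training triplet and drop any
    training triplet that also appears in the test set."""
    test_triplets = set(test)
    seen = set()
    cleaned = []
    for triplet in train:
        if triplet in seen or triplet in test_triplets:
            continue  # exact duplicate, or leaked into the test set
        seen.add(triplet)
        cleaned.append(triplet)
    return cleaned


# Toy data: one duplicate in train and one triplet shared with test.
train = [
    ("a.jpg", "is this an mri?", "no"),
    ("a.jpg", "is this an mri?", "no"),
    ("b.jpg", "what is the plane?", "axial"),
    ("c.jpg", "is the heart enlarged?", "yes"),
]
test = [("c.jpg", "is the heart enlarged?", "yes")]

cleaned = clean_training_set(train, test)
print(len(train), "->", len(cleaned))  # prints: 4 -> 2
```

Keeping the first occurrence of each duplicate and removing the overlapping triplet only from the training side matches the description above, in which the test set is left unchanged.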

Supported Tasks and Leaderboards

This dataset has an active leaderboard on Papers with Code where models are ranked based on three metrics: "Close-ended Accuracy", "Open-ended accuracy" and "Overall accuracy". "Close-ended Accuracy" is the accuracy of a model's generated answers for the subset of binary "yes/no" questions. "Open-ended accuracy" is the accuracy of a model's generated answers for the subset of open-ended questions. "Overall accuracy" is the accuracy of a model's generated answers across all questions.
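
As a rough illustration of how the three metrics relate, the sketch below computes them from lists of predicted and reference answers. It rests on two assumptions that are not taken from the leaderboard: a question is treated as close-ended when its reference answer is "yes" or "no", and an answer counts as correct on a case-insensitive exact match. The function name `vqa_accuracies` is hypothetical, and published results may use a different matching rule.

```python
from typing import Dict, List, Optional, Sequence


def _normalize(text: str) -> str:
    # Assumed normalization: trim whitespace and lowercase.
    return text.strip().lower()


def _mean(flags: List[bool]) -> Optional[float]:
    # None signals an empty subset instead of dividing by zero.
    return sum(flags) / len(flags) if flags else None


def vqa_accuracies(
    predictions: Sequence[str], references: Sequence[str]
) -> Dict[str, Optional[float]]:
    """Exact-match accuracy on close-ended, open-ended and all questions."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    closed: List[bool] = []
    opened: List[bool] = []
    for pred, ref in zip(predictions, references):
        correct = _normalize(pred) == _normalize(ref)
        # Assumption: a yes/no reference answer marks a close-ended question.
        if _normalize(ref) in {"yes", "no"}:
            closed.append(correct)
        else:
            opened.append(correct)
    return {
        "closed": _mean(closed),
        "open": _mean(opened),
        "overall": _mean(closed + opened),
    }


references = ["yes", "no", "yes", "mri", "axial"]
predictions = ["yes", "no", "no", "ct", "Axial "]
scores = vqa_accuracies(predictions, references)
print(scores)
```

In this toy example two of the three close-ended answers and one of the two open-ended answers match, so the overall accuracy is three out of five.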

Languages

The question-answer pairs are in English.

Dataset Structure

Data Instances

Each instance consists of an image-question-answer triplet.

{
  'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=566x555>,
  'question': 'are regions of the brain infarcted?',
  'answer': 'yes'
}

Data Fields

  • 'image': the image referenced by the question-answer pair.
  • 'question': the question about the image.
  • 'answer': the expected answer.
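
A small sanity check for this schema might look like the sketch below. It assumes only what is listed above (three fields, with the question and answer stored as strings); the helper name `check_instance` is hypothetical, and the image is left unchecked so that the example does not depend on any imaging library.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("image", "question", "answer")


def check_instance(instance: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one instance (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in instance:
            problems.append(f"missing field: {field}")
    for field in ("question", "answer"):
        # Only type-check text fields that are actually present.
        if field in instance and not isinstance(instance[field], str):
            problems.append(f"{field} should be a string")
    return problems


example = {
    "image": object(),  # placeholder for the decoded image
    "question": "are regions of the brain infarcted?",
    "answer": "yes",
}
print(check_instance(example))  # prints: []
```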

Data Splits

The dataset is split into training and test. The split is provided directly by the authors.

         Training Set   Test Set
QAs      1,793          451
Images   313            203
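
The split sizes can be cross-checked against the totals given in the Dataset Summary. The short calculation below uses only figures from this card. The overlap estimate additionally assumes that each of the 314 images appears in at least one split, which seems to follow from every image being referenced by a question-answer pair but is not stated explicitly.

```python
# Figures taken from this card.
train_qas, test_qas = 1_793, 451
train_images, test_images = 313, 203
total_images = 314  # images referenced by at least one question-answer pair

# Question-answer pairs are disjoint between splits, so they simply add up.
total_qas = train_qas + test_qas

# Images are counted per split, so one image can be counted twice.
# Inclusion-exclusion: |train ∩ test| = |train| + |test| - |train ∪ test|,
# assuming the union covers all 314 images.
shared_images = train_images + test_images - total_images

print(total_qas)      # prints: 2244
print(shared_images)  # prints: 202
```

Under that assumption, roughly 202 images would be shared between the two splits, with different questions asked about them in each. This is worth keeping in mind when interpreting test scores, since the split separates question-answer pairs rather than images.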

Additional Information

Licensing Information

The authors have released the dataset under the CC0 1.0 Universal License.

Citation Information

@article{lau2018dataset,
    title={A dataset of clinically generated visual questions and answers about radiology images},
    author={Lau, Jason J and Gayen, Soumya and Ben Abacha, Asma and Demner-Fushman, Dina},
    journal={Scientific data},
    volume={5},
    number={1},
    pages={1--10},
    year={2018},
    publisher={Nature Publishing Group}
}