OK-VQA
Commit 1580ad5 · Jun 28, 2021 12:13 AM · 1 commit

Overview

OK-VQA is a new dataset for visual question answering whose questions can only be answered by methods that draw upon outside knowledge.

  • 14,055 open-ended questions
  • 5 ground truth answers per question
  • Manually filtered to ensure all questions require outside knowledge (e.g. from Wikipedia)
  • Fewer questions with the most common answers, to reduce dataset bias

Data Format

Input Questions Format

The questions are stored using the JSON file format.

The questions format has the following data structure:

{
  "info": info,
  "task_type": str,
  "data_type": str,
  "data_subtype": str,
  "questions": [question],
  "license": license
}

info {
  "year": int,
  "version": str,
  "description": str,
  "contributor": str,
  "url": str,
  "date_created": datetime
}

license {
  "name": str,
  "url": str
}

question {
  "question_id": int,
  "image_id": int,
  "question": str
}
  • task_type: type of annotations in the JSON file (OpenEnded).
  • data_type: source of the images (mscoco or abstract_v002).
  • data_subtype: the data split (e.g. train2014/val2014/test2015 for mscoco, train2015/val2015 for abstract_v002).
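
As a rough illustration of the questions structure above, the sketch below parses a small in-memory document and indexes its questions by question_id. The sample values, the SAMPLE_QUESTIONS constant, and the index_questions helper are all hypothetical and are not part of the dataset or any official tooling. A real questions file would be loaded with json.load instead.

```python
import json

# Hypothetical sample following the documented schema; every value is made up.
SAMPLE_QUESTIONS = json.loads("""
{
  "info": {"year": 0, "version": "n/a", "description": "sample",
           "contributor": "n/a", "url": "n/a", "date_created": "n/a"},
  "task_type": "OpenEnded",
  "data_type": "mscoco",
  "data_subtype": "val2014",
  "questions": [
    {"question_id": 1, "image_id": 10, "question": "What sport is this?"},
    {"question_id": 2, "image_id": 11, "question": "Who invented this device?"}
  ],
  "license": {"name": "n/a", "url": "n/a"}
}
""")

# Keys that the documented "question" structure lists.
REQUIRED_QUESTION_KEYS = {"question_id", "image_id", "question"}


def index_questions(doc):
    """Return {question_id: question record}, checking the documented keys."""
    indexed = {}
    for q in doc["questions"]:
        missing = REQUIRED_QUESTION_KEYS - q.keys()
        if missing:
            raise ValueError(f"question is missing keys: {sorted(missing)}")
        indexed[q["question_id"]] = q
    return indexed


questions_by_id = index_questions(SAMPLE_QUESTIONS)
print(questions_by_id[2]["question"])
```

Indexing by question_id is a convenience for later pairing each question with its annotation record, which carries the same key.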

Annotation Format

The annotations are stored using the JSON file format.

The annotations format has the following data structure:

{
  "info": info,
  "data_type": str,
  "data_subtype": str,
  "annotations": [annotation],
  "license": license
}

info {
  "year": int,
  "version": str,
  "description": str,
  "contributor": str,
  "url": str,
  "date_created": datetime
}

license {
  "name": str,
  "url": str
}

annotation {
  "question_id": int,
  "image_id": int,
  "question_type": str,
  "answer_type": str,
  "answers": [answer],
  "multiple_choice_answer": str
}

answer {
  "answer_id": int,
  "answer": str,
  "answer_confidence": str
}
  • data_type: source of the images (mscoco or abstract_v002).
  • data_subtype: the data split (e.g. train2014/val2014/test2015 for mscoco, train2015/val2015 for abstract_v002).
  • question_type: type of the question, determined by its first few words. For details, please see the README.
  • answer_type: type of the answer. Currently one of "yes/no", "number", or "other".
  • multiple_choice_answer: the most frequent ground-truth answer.
  • answer_confidence: the subject's confidence in answering the question. For details, please see Antol et al., ICCV 2015.
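
To make the relationship between answers and multiple_choice_answer concrete, here is a minimal sketch that recomputes the most frequent ground-truth answer from one annotation record. The record, its values (including the answer_confidence strings), and the most_frequent_answer helper are assumptions made for illustration. How the dataset itself breaks ties is not stated here, so this sketch simply keeps the first answer seen among equally frequent ones.

```python
from collections import Counter

# Hypothetical annotation record following the documented schema;
# all values, including the answer_confidence strings, are made up.
sample_annotation = {
    "question_id": 1,
    "image_id": 10,
    "question_type": "what",
    "answer_type": "other",
    "answers": [
        {"answer_id": 1, "answer": "tennis", "answer_confidence": "yes"},
        {"answer_id": 2, "answer": "tennis", "answer_confidence": "yes"},
        {"answer_id": 3, "answer": "badminton", "answer_confidence": "maybe"},
        {"answer_id": 4, "answer": "tennis", "answer_confidence": "yes"},
        {"answer_id": 5, "answer": "tennis", "answer_confidence": "yes"},
    ],
    "multiple_choice_answer": "tennis",
}


def most_frequent_answer(annotation):
    """Return the answer string that occurs most often in annotation["answers"]."""
    counts = Counter(a["answer"] for a in annotation["answers"])
    # most_common(1) yields [(answer, count)]; ties keep first-seen order.
    answer, _count = counts.most_common(1)[0]
    return answer


recomputed = most_frequent_answer(sample_annotation)
print(recomputed == sample_annotation["multiple_choice_answer"])
```

A check like this can serve as a quick sanity test after loading an annotations file, under the assumption that the stored field was derived by simple majority.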

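Thanks to Hello Dataset for contributing this dataset.

Dataset Information

  • Application scenario: not specified
  • Annotation type: not specified
  • Task type: not specified
  • License: Custom
  • Updated: 2020-12-31 17:30:16

Data Summary

  • Data format: not specified
  • Number of items: 0
  • Number annotated: 0
  • File size: 19MB

Copyright Holder

Allen Institute for Artificial Intelligence

Annotated By

Unknown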