SimpleQuestions v2
Created by Hello Dataset

Overview

SimpleQuestions is a dataset collected for research in automatic question answering with human-generated questions. Details and baseline results on this dataset can be found in the paper:

Antoine Bordes, Nicolas Usunier, Sumit Chopra and Jason Weston. Large-scale Simple Question Answering with Memory Networks, arXiv:1506.02075.

The dataset consists of 108,442 questions written in natural language by English-speaking human annotators. Each question is paired with a corresponding fact, formatted as (subject, relationship, object), that provides both the answer and a complete explanation. The facts were extracted from the Freebase knowledge base. The questions were randomly shuffled and split into a training set of 70% (75,910 questions), a validation set of 10% (10,845 questions), and a test set of the remaining 20% (21,687 questions).
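
The split sizes quoted above can be checked with a few lines of Python. This is only a sketch: the function names `split_sizes` and `shuffle_and_split` are hypothetical, the seed is arbitrary, and rounding the 70% and 10% shares up is an assumption that happens to reproduce the stated counts. The authors' actual shuffling procedure is not described on this page.

```python
import random

TOTAL = 108_442  # total number of questions stated in the overview


def split_sizes(total, train_pct=70, valid_pct=10):
    """Return (train, valid, test) sizes.

    Assumption: the train and validation shares are rounded up;
    the test set takes whatever remains.
    """
    train = -(-total * train_pct // 100)  # ceiling division
    valid = -(-total * valid_pct // 100)
    return train, valid, total - train - valid


def shuffle_and_split(items, seed=0):
    """Shuffle a copy of `items` and cut it into train/valid/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for repeatability
    n_train, n_valid, _ = split_sizes(len(items))
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]
    return train, valid, test


print(split_sizes(TOTAL))  # (75910, 10845, 21687)
```

Computing the test size as the remainder guarantees that the three parts always add up to the full dataset, whatever rounding is applied to the other two.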

Here are some examples of questions and facts:

* What American cartoonist is the creator of Andy Lippincott?
  Fact: (andy_lippincott, character_created_by, garry_trudeau)
* Which forest is Fires Creek in?
  Fact: (fires_creek, containedby, nantahala_national_forest)
* What does Jimmy Neutron do?
  Fact: (jimmy_neutron, fictional_character_occupation, inventor)
* What dietary restriction is incompatible with kimchi?
  Fact: (kimchi, incompatible_with_dietary_restrictions, veganism)
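
As an illustration of the (subject, relationship, object) structure, the sketch below parses a fact written the way the examples above display it. The `Fact` type and `parse_fact` helper are hypothetical names introduced here, and the parenthesized, comma-separated form is assumed from this page's examples only; the distributed files may use a different layout.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One fact triple, with field names taken from the description above."""
    subject: str
    relationship: str
    object: str


def parse_fact(text: str) -> Fact:
    """Parse a string such as '(kimchi, relation, veganism)' into a Fact."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"expected a parenthesized triple, got: {text!r}")
    parts = [part.strip() for part in inner[1:-1].split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {text!r}")
    return Fact(*parts)


fact = parse_fact("(fires_creek, containedby, nantahala_national_forest)")
print(fact.subject, fact.relationship, fact.object)
```

For the question "Which forest is Fires Creek in?", the answer is the `object` field of the parsed fact, while the `subject` and `relationship` fields supply the explanation.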

Citation

Please use the following citation when referencing the dataset:

@article{bordes2015large,
  title={Large-scale simple question answering with memory networks},
  author={Bordes, Antoine and Usunier, Nicolas and Chopra, Sumit and Weston, Jason},
  journal={arXiv preprint arXiv:1506.02075},
  year={2015}
}
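
Dataset Information

* Application: NLP
* Annotation type: Text
* Task type: None
* License: Unknown
* Last updated: 2021-03-24 22:56:20

Data Summary

* Data format: Text
* Number of items: 0
* Number of annotated items: 0
* File size: 404KB
* Copyright holder: Facebook Research
* Annotated by: Unknown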