WikiMovies
Created by Hello Dataset

Overview

Directly reading documents and being able to answer questions from them is an unsolved challenge. To avoid its inherent difficulty, question answering (QA) has been directed towards using Knowledge Bases (KBs) instead, which has proven effective. Unfortunately, KBs often suffer from being too restrictive, as the schema cannot support certain types of answers, and too sparse, e.g. Wikipedia contains much more information than Freebase. In this work we introduce a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation. To compare using KBs, information extraction or Wikipedia documents directly in a single framework, we construct an analysis tool, WIKIMOVIES, a QA dataset that contains raw text alongside a preprocessed KB, in the domain of movies. Our method reduces the gap between all three settings. It also achieves state-of-the-art results on the existing WIKIQA benchmark. (EMNLP 2016)
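To make the phrase "different encodings in the addressing and output stages" concrete, here is a minimal sketch of a single key-value memory read. It is an illustration only, not the paper's implementation: the function names (`softmax`, `dot`, `key_value_read`) and the toy vectors are assumptions made for this example. It also uses fixed vectors and one read step, so any learned embeddings or further machinery in the actual model are outside its scope.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def key_value_read(query, keys, values):
    """One memory read: address with keys, read out with values.

    The query is compared against each key to get attention weights
    (addressing stage); the output is the weighted sum of the
    corresponding values (output stage). Keys and values may live in
    different vector spaces, which is the point of the split.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy memory (hypothetical numbers): two slots, 2-d keys, 3-d values.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0, 0.0], [0.0, 20.0, 0.0]]
query = [5.0, 0.0]

weights, output = key_value_read(query, keys, values)
```

Because the query aligns with the first key, nearly all of the attention weight falls on the first slot, and the output is dominated by the first value vector even though that value has a different dimensionality from the key.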

The dataset includes only the QA part of the Movie Dialog dataset, but offers it under three different knowledge settings: a traditional knowledge base (KB), Wikipedia as the source of knowledge, or IE (information extraction) over Wikipedia. This makes it possible to test how well models can answer questions by reading documents directly, and to compare this against traditional KBs in the same setting. See the paper for more details:

A. H. Miller, A. Fisch, J. Dodge, A. Karimi, A. Bordes, J. Weston. Key-Value Memory Networks for Directly Reading Documents, arXiv:1606.03126.

Citation

Please use the following citation when referencing the dataset:

@article{miller2016key,
  title={Key-value memory networks for directly reading documents},
  author={Miller, Alexander and Fisch, Adam and Dodge, Jesse and Karimi, Amir-Hossein and Bordes, Antoine and Weston, Jason},
  journal={arXiv preprint arXiv:1606.03126},
  year={2016}
}
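Dataset Information
Application scenario: NLP
Annotation type: Text
Task type: None
License: Unknown
Last updated: 2021-03-24 22:57:12
Data Summary
Data format: Text
Number of data items: 0
Number of annotated items: 0
File size: 54KB
Copyright holder
Facebook Research
Annotated by
Unknown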