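Dataset: web_questions
Task: question-answering
Sub-task: open-domain-qa
Language: en
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: crowdsourced
Source datasets: original
License: unknown

This dataset consists of 6,642 question/answer pairs. The questions are meant to be answerable with Freebase, a large knowledge graph. Most questions center on a single named entity. They are popular questions that people asked on the web (at least in 2013).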
An example from 'train' looks as follows.
{ "answers": ["Jamaican Creole English Language", "Jamaican English"], "question": "what does jamaican people speak?", "url": "http://www.freebase.com/view/en/jamaica" }
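As a minimal sketch of how a record of this shape might be read, the snippet below parses the example above with Python's standard `json` module and checks its three fields. The helper names `check_record` and `topic_from_url` are hypothetical and are not part of the dataset or any library. Treating the last URL path segment as the topic is an assumption based on this single example.

```python
import json

# The 'train' example shown above, copied as a JSON string.
RAW_EXAMPLE = """
{ "answers": ["Jamaican Creole English Language", "Jamaican English"],
  "question": "what does jamaican people speak?",
  "url": "http://www.freebase.com/view/en/jamaica" }
"""


def check_record(record):
    """Return True if the record has the three fields seen in the example."""
    return (
        isinstance(record.get("question"), str)
        and isinstance(record.get("url"), str)
        and isinstance(record.get("answers"), list)
        and all(isinstance(a, str) for a in record["answers"])
    )


def topic_from_url(url):
    """Hypothetical helper: take the last path segment of the Freebase URL."""
    return url.rstrip("/").rsplit("/", 1)[-1]


record = json.loads(RAW_EXAMPLE)
print(check_record(record))           # structural check on the parsed record
print(record["question"])             # the natural-language question
print(len(record["answers"]))         # a question may have several answers
print(topic_from_url(record["url"]))  # assumed topic key from the URL
```

With the Hugging Face `datasets` library installed, `load_dataset("web_questions")` should yield records with these same fields, although that call needs network access and is not exercised here.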
The data fields are the same among all splits.
default

name | train | test |
---|---|---|
default | 3778 | 2032 |
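The split sizes in the table can be summarized with a few lines of arithmetic. This is only an illustrative sketch: the `SPLITS` dictionary and the `split_summary` helper are hypothetical names, and the counts are copied from the table. The two listed splits sum to 5,810, which is lower than the 6,642 pairs quoted in the description, and the card does not explain the difference.

```python
# Split sizes copied from the table above.
SPLITS = {"train": 3778, "test": 2032}


def split_summary(splits):
    """Return the total example count and each split's share of it."""
    total = sum(splits.values())
    shares = {name: count / total for name, count in splits.items()}
    return total, shares


total, shares = split_summary(SPLITS)
print(total)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```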
@inproceedings{berant-etal-2013-semantic,
    title = "Semantic Parsing on {F}reebase from Question-Answer Pairs",
    author = "Berant, Jonathan and Chou, Andrew and Frostig, Roy and Liang, Percy",
    booktitle = "Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
    month = oct,
    year = "2013",
    address = "Seattle, Washington, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D13-1160",
    pages = "1533--1544",
}
Thanks to @thomwolf, @mariamabarham, @lewtun for adding this dataset.