Dataset: web_questions
Sub-tasks: open-domain-qa
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotations creators: crowdsourced
Source datasets: original
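This dataset consists of 6,642 question/answer pairs. The questions are meant to be answerable using Freebase, a large knowledge graph. Most of them center on a single named entity, and they are popular questions asked on the web (at least in 2013).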
An example from 'train' looks as follows.
{
"answers": ["Jamaican Creole English Language", "Jamaican English"],
"question": "what does jamaican people speak?",
"url": "http://www.freebase.com/view/en/jamaica"
}
The data fields are the same among all splits.
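As a minimal sketch of the record layout, the snippet below parses the example above and checks its three fields. The field names and types (`url` and `question` as strings, `answers` as a list of strings) are inferred from that single example; the names `WebQuestionsRecord` and `parse_record` are hypothetical helpers introduced here for illustration, not part of the dataset or any library.

```python
import json
from typing import List, TypedDict


class WebQuestionsRecord(TypedDict):
    """Hypothetical type name; the fields are inferred from the example record."""
    url: str
    question: str
    answers: List[str]


def parse_record(raw: str) -> WebQuestionsRecord:
    """Parse one JSON record and check that the three expected fields are present."""
    data = json.loads(raw)
    if not isinstance(data.get("url"), str):
        raise ValueError("'url' must be a string")
    if not isinstance(data.get("question"), str):
        raise ValueError("'question' must be a string")
    answers = data.get("answers")
    if not isinstance(answers, list) or not all(isinstance(a, str) for a in answers):
        raise ValueError("'answers' must be a list of strings")
    return WebQuestionsRecord(
        url=data["url"], question=data["question"], answers=answers
    )


# The 'train' example shown above, reproduced verbatim.
EXAMPLE = """
{
  "answers": ["Jamaican Creole English Language", "Jamaican English"],
  "question": "what does jamaican people speak?",
  "url": "http://www.freebase.com/view/en/jamaica"
}
"""

record = parse_record(EXAMPLE)
print(record["question"], "->", record["answers"])
```

Note that a question can have more than one accepted answer, so code that consumes these records should treat `answers` as a list rather than a single string.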
name | train | test |
---|---|---|
default | 3778 | 2032 |
@inproceedings{berant-etal-2013-semantic,
title = "Semantic Parsing on {F}reebase from Question-Answer Pairs",
author = "Berant, Jonathan and
Chou, Andrew and
Frostig, Roy and
Liang, Percy",
booktitle = "Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
month = oct,
year = "2013",
address = "Seattle, Washington, USA",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D13-1160",
pages = "1533--1544",
}
Thanks to @thomwolf, @mariamabarham, @lewtun for adding this dataset.