Dataset: quoref
Tasks:
Languages:
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotation creators: crowdsourced
Source datasets: original
License:
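Quoref is a question-answering dataset for testing the coreferential reasoning ability of reading comprehension systems. It contains 24K questions over 4.7K paragraphs from Wikipedia, and a system must resolve coreferences before it can select the appropriate span in the paragraph to answer a question.
An example from the 'validation' split looks as follows.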
This example was too long and was cropped: { "answers": { "answer_start": [1633], "text": ["Frankie"] }, "context": "\"Frankie Bono, a mentally disturbed hitman from Cleveland, comes back to his hometown in New York City during Christmas week to ...", "id": "bfc3b34d6b7e73c0bd82a009db12e9ce196b53e6", "question": "What is the first name of the person who has until New Year's Eve to perform a hit?", "title": "Blast of Silence", "url": "https://en.wikipedia.org/wiki/Blast_of_Silence" }
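The record above pairs each answer text with an `answer_start` value. A minimal sketch of a consistency check follows, assuming `answer_start` is a 0-based character offset into `context` (the usual span-selection convention, not confirmed by this card). The helper name `misaligned_answers` and the toy record are hypothetical and are not taken from Quoref.

```python
from typing import Any, Dict, List


def misaligned_answers(record: Dict[str, Any]) -> List[str]:
    """Return the answer texts that do not appear at their stated offset."""
    context = record["context"]
    answers = record["answers"]
    bad = []
    for start, text in zip(answers["answer_start"], answers["text"]):
        # Assumption: answer_start is a 0-based character offset into context.
        if context[start:start + len(text)] != text:
            bad.append(text)
    return bad


# Toy record invented for illustration; it is not a real Quoref example.
toy = {
    "context": "Ann met Bob. She greeted him.",
    "question": "What is the name of the person who greeted Bob?",
    "answers": {"answer_start": [0], "text": ["Ann"]},
}
print(misaligned_answers(toy))  # prints []
```

An empty list means every answer text was found at its stated offset; any returned text points to a record whose offset and context disagree.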
The data fields are the same across all splits.
name | train | validation |
---|---|---|
default | 19399 | 2418 |
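The row counts in the table can serve as a quick sanity check after downloading. The sketch below compares the length of each split against those counts. The helper names `split_size_mismatches` and `load_quoref` are hypothetical, and the loader assumes the Hugging Face `datasets` package is installed and that the dataset is still reachable under the identifier `quoref`, which may differ across library versions.

```python
from typing import Dict, Mapping, Sized, Tuple

# Row counts copied from the table above.
EXPECTED_ROWS = {"train": 19399, "validation": 2418}


def split_size_mismatches(splits: Mapping[str, Sized]) -> Dict[str, Tuple[int, int]]:
    """Map each split whose length differs from the table to (expected, actual)."""
    mismatches = {}
    for name, expected in EXPECTED_ROWS.items():
        # A missing split is reported with an actual size of 0.
        actual = len(splits[name]) if name in splits else 0
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


def load_quoref():
    # Assumption: `datasets` is installed and "quoref" resolves on the Hub.
    # The import is deferred so the checker above works without the package.
    from datasets import load_dataset
    return load_dataset("quoref")
```

Calling `split_size_mismatches(load_quoref())` would then return an empty dictionary if the downloaded splits match the table.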
@article{allenai:quoref,
  author  = {Pradeep Dasigi and Nelson F. Liu and Ana Marasovic and Noah A. Smith and Matt Gardner},
  title   = {Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning},
  journal = {arXiv:1908.05803v2},
  year    = {2019},
}
Thanks to @lewtun, @patrickvonplaten, and @thomwolf for adding this dataset.