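Dataset: openbookqa
Tasks: question-answering
Sub-tasks: open-domain-qa
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: expert-generated
Source datasets: original
License: unknown

OpenBookQA aims to promote research in advanced question answering by probing a deeper understanding of both the topic (with salient facts summarized as an open book, which is also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject.

An example of "train" in the "main" configuration looks as follows: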
{'id': '7-980', 'question_stem': 'The sun is responsible for', 'choices': {'text': ['puppies learning new tricks', 'children growing up and getting old', 'flowers wilting in a vase', 'plants sprouting, blooming and wilting'], 'label': ['A', 'B', 'C', 'D']}, 'answerKey': 'D'}

An example of "train" in the "additional" configuration looks as follows:
{'id': '7-980', 'question_stem': 'The sun is responsible for', 'choices': {'text': ['puppies learning new tricks', 'children growing up and getting old', 'flowers wilting in a vase', 'plants sprouting, blooming and wilting'], 'label': ['A', 'B', 'C', 'D']}, 'answerKey': 'D', 'fact1': 'the sun is the source of energy for physical cycles on Earth', 'humanScore': 1.0, 'clarity': 2.0, 'turkIdAnonymized': 'b356d338b7'}
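
The record layout above stores the choice texts and their labels as two parallel lists, with `answerKey` naming the correct label. The following is a minimal sketch of how such a record might be read, using the sample record shown above; the helper names `answer_text` and `format_question` are hypothetical and not part of the dataset or any library.

```python
from typing import Any, Dict

# Sample record copied from the "additional" example in this card.
record: Dict[str, Any] = {
    "id": "7-980",
    "question_stem": "The sun is responsible for",
    "choices": {
        "text": [
            "puppies learning new tricks",
            "children growing up and getting old",
            "flowers wilting in a vase",
            "plants sprouting, blooming and wilting",
        ],
        "label": ["A", "B", "C", "D"],
    },
    "answerKey": "D",
    "fact1": "the sun is the source of energy for physical cycles on Earth",
    "humanScore": 1.0,
    "clarity": 2.0,
    "turkIdAnonymized": "b356d338b7",
}


def answer_text(example: Dict[str, Any]) -> str:
    """Return the choice text whose label matches the example's answerKey."""
    labels = example["choices"]["label"]
    texts = example["choices"]["text"]
    return texts[labels.index(example["answerKey"])]


def format_question(example: Dict[str, Any]) -> str:
    """Render the question stem followed by one 'label. text' line per choice."""
    lines = [example["question_stem"]]
    for label, text in zip(example["choices"]["label"], example["choices"]["text"]):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_question(record))
    print("Answer:", answer_text(record))
```

Because the helpers only touch `question_stem`, `choices`, and `answerKey`, they should apply equally to records from the "main" configuration, which lacks the `fact1`, `humanScore`, `clarity`, and `turkIdAnonymized` fields.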
The data fields are the same among all splits.

name | train | validation | test |
---|---|---|---|
main | 4957 | 500 | 500 |
additional | 4957 | 500 | 500 |
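
As a rough sketch, the split sizes in the table can be used as a sanity check after loading. The snippet below assumes the Hugging Face `datasets` library is installed and that the identifier `"openbookqa"` (taken from the card header) resolves on the Hub; both are assumptions, and the helper names `check_split_sizes` and `load_and_check` are hypothetical.

```python
from typing import Any, Dict, Mapping

# Split sizes copied from the table in this card.
SPLIT_SIZES: Dict[str, Dict[str, int]] = {
    "main": {"train": 4957, "validation": 500, "test": 500},
    "additional": {"train": 4957, "validation": 500, "test": 500},
}


def check_split_sizes(splits: Mapping[str, Any], config: str) -> Dict[str, bool]:
    """Compare the length of each split against the size listed in the table."""
    expected = SPLIT_SIZES[config]
    return {name: len(splits[name]) == size for name, size in expected.items()}


def load_and_check(config: str = "main") -> Dict[str, bool]:
    """Load one configuration and verify its split sizes.

    Assumption: `datasets` is installed and "openbookqa" is a valid
    dataset identifier in the environment where this runs.
    """
    from datasets import load_dataset  # imported lazily; needs network access

    dataset = load_dataset("openbookqa", config)
    return check_split_sizes(dataset, config)
```

`check_split_sizes` accepts any mapping from split name to a sized object, so it can be exercised without network access; `load_and_check` is the only part that depends on the external library.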

@inproceedings{OpenBookQA2018,
  title={Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering},
  author={Todor Mihaylov and Peter Clark and Tushar Khot and Ashish Sabharwal},
  booktitle={EMNLP},
  year={2018}
}

Thanks to @thomwolf, @patrickvonplaten, and @lewtun for adding this dataset.