Dataset:

facebook/babi_qa

Tasks:

question-answering

Languages:

en

Multilinguality:

monolingual

Language Creators:

machine-generated

Annotations Creators:

machine-generated

Source Datasets:

original

Other:

chained-qa

License:

cc-by-3.0

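Dataset Card for bAbi QA

Dataset Summary

The 20 bAbI QA tasks are a set of proxy tasks that evaluate reading comprehension via question answering. They measure understanding in several ways: whether a system can answer questions by chaining facts, simple induction, deduction, and more. The tasks are designed to be prerequisites for any system that aims to converse with a human. The aim is to classify the tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems.

Supported Tasks and Leaderboards

The dataset supports a set of 20 proxy story-based question answering tasks, available in several "types" in English and Hindi. The tasks are: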

task_no task_name
qa1 single-supporting-fact
qa2 two-supporting-facts
qa3 three-supporting-facts
qa4 two-arg-relations
qa5 three-arg-relations
qa6 yes-no-questions
qa7 counting
qa8 lists-sets
qa9 simple-negation
qa10 indefinite-knowledge
qa11 basic-coreference
qa12 conjunction
qa13 compound-coreference
qa14 time-reasoning
qa15 basic-deduction
qa16 basic-induction
qa17 positional-reasoning
qa18 size-reasoning
qa19 path-finding
qa20 agents-motivations

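The "types" are:

  • en

    • The tasks in English, readable by humans.
  • hn

    • The tasks in Hindi, readable by humans.
  • shuffled

    • The same tasks with shuffled letters, so they are not readable by humans and existing parsers and taggers cannot be used in a straightforward way to leverage extra resources. In this case the learner is forced to rely more on the given training data. This mimics a learner being presented with a language for the first time and having to learn it from scratch.
  • en-10k, shuffled-10k and hn-10k

    • The same tasks in these three formats, but with 10,000 training examples rather than 1,000.
  • en-valid and en-valid-10k

    • The same as en and en-10k, except that the train sets have been conveniently split into train and validation portions (a 90%/10% split).

To get a particular dataset, use load_dataset('babi_qa',type=f'{type}',task_no=f'{task_no}'), where type is one of the types and task_no is one of the task numbers. For example, load_dataset('babi_qa',type='en',task_no='qa1').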
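
As a minimal sketch of how the two arguments fit together, the helper below validates a type and a task number and returns the keyword arguments for load_dataset. The names babi_qa_kwargs and config_name are hypothetical and not part of the datasets library; the "<type>-<task_no>" label simply mirrors the row names used in the data splits table of this card. The load_dataset call itself is left commented out because it needs the datasets package and network access.

```python
# Types and task numbers as listed in this card.
TYPES = ("en", "hn", "shuffled", "en-10k", "hn-10k", "shuffled-10k",
         "en-valid", "en-valid-10k")
TASK_NOS = tuple(f"qa{i}" for i in range(1, 21))


def babi_qa_kwargs(type_, task_no):
    """Hypothetical helper: validate the inputs and build load_dataset kwargs."""
    if type_ not in TYPES:
        raise ValueError(f"unknown type {type_!r}; expected one of {TYPES}")
    if task_no not in TASK_NOS:
        raise ValueError(f"unknown task_no {task_no!r}; expected qa1..qa20")
    return {"type": type_, "task_no": task_no}


def config_name(type_, task_no):
    """Row label in the style of the data splits table, e.g. 'en-10k-qa3'."""
    babi_qa_kwargs(type_, task_no)  # reuse the validation above
    return f"{type_}-{task_no}"


kwargs = babi_qa_kwargs("en", "qa1")
print(kwargs, config_name("en-10k", "qa3"))

# With the `datasets` package installed, the call described above would be:
# from datasets import load_dataset
# dataset = load_dataset("babi_qa", **kwargs)
```

Validating up front gives a readable error for a mistyped type or task number before any download is attempted.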

Languages

Dataset Structure

Data Instances

An instance from the train split of en-qa1:

{'story': {'answer': ['', '', 'bathroom', '', '', 'hallway', '', '', 'hallway', '', '', 'office', '', '', 'bathroom'], 'id': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15'], 'supporting_ids': [[], [], ['1'], [], [], ['4'], [], [], ['4'], [], [], ['11'], [], [], ['8']], 'text': ['Mary moved to the bathroom.', 'John went to the hallway.', 'Where is Mary?', 'Daniel went back to the hallway.', 'Sandra moved to the garden.', 'Where is Daniel?', 'John moved to the office.', 'Sandra journeyed to the bathroom.', 'Where is Daniel?', 'Mary moved to the hallway.', 'Daniel travelled to the office.', 'Where is Daniel?', 'John went back to the garden.', 'John moved to the bedroom.', 'Where is Sandra?'], 'type': [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1]}}

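Data Fields

  • story: a dictionary feature containing:
    • id: a string feature, the line number within the example.
    • type: a classification label with possible values context and question, indicating whether the text is context or a question.
    • text: a string feature, the text itself, whether it is a question or context.
    • supporting_ids: a list of string features, the line numbers of the lines in the example that support the answer.
    • answer: a string feature, the answer to the question, or an empty string if type is not question.

Data Splits

The splits and their sizes are: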
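
The fields above are parallel lists, so a question can be paired with its answer and supporting lines by index. The sketch below uses an abridged copy (first six lines) of the en-qa1 instance shown in this card; the function name extract_qa is hypothetical, and the mapping of type 1 to question and 0 to context is an assumption read off that instance.

```python
# Abridged copy of the en-qa1 instance from this card (first six lines only).
example = {"story": {
    "id": ["1", "2", "3", "4", "5", "6"],
    "type": [0, 0, 1, 0, 0, 1],
    "text": ["Mary moved to the bathroom.", "John went to the hallway.",
             "Where is Mary?", "Daniel went back to the hallway.",
             "Sandra moved to the garden.", "Where is Daniel?"],
    "supporting_ids": [[], [], ["1"], [], [], ["4"]],
    "answer": ["", "", "bathroom", "", "", "hallway"],
}}

QUESTION = 1  # assumption: question lines carry type 1, context lines type 0


def extract_qa(story):
    """Pair each question with its answer and the text of its supporting lines."""
    text_by_id = dict(zip(story["id"], story["text"]))
    pairs = []
    for i, line_type in enumerate(story["type"]):
        if line_type != QUESTION:
            continue  # context lines have no answer
        pairs.append({
            "question": story["text"][i],
            "answer": story["answer"][i],
            "supporting_facts": [text_by_id[s] for s in story["supporting_ids"][i]],
        })
    return pairs


for qa in extract_qa(example["story"]):
    print(qa)
```

Looking up supporting lines by id rather than by list position keeps the sketch correct even if ids do not start at 1.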

config train test validation
en-qa1 200 200 -
en-qa2 200 200 -
en-qa3 200 200 -
en-qa4 1000 1000 -
en-qa5 200 200 -
en-qa6 200 200 -
en-qa7 200 200 -
en-qa8 200 200 -
en-qa9 200 200 -
en-qa10 200 200 -
en-qa11 200 200 -
en-qa12 200 200 -
en-qa13 200 200 -
en-qa14 200 200 -
en-qa15 250 250 -
en-qa16 1000 1000 -
en-qa17 125 125 -
en-qa18 198 199 -
en-qa19 1000 1000 -
en-qa20 94 93 -
en-10k-qa1 2000 200 -
en-10k-qa2 2000 200 -
en-10k-qa3 2000 200 -
en-10k-qa4 10000 1000 -
en-10k-qa5 2000 200 -
en-10k-qa6 2000 200 -
en-10k-qa7 2000 200 -
en-10k-qa8 2000 200 -
en-10k-qa9 2000 200 -
en-10k-qa10 2000 200 -
en-10k-qa11 2000 200 -
en-10k-qa12 2000 200 -
en-10k-qa13 2000 200 -
en-10k-qa14 2000 200 -
en-10k-qa15 2500 250 -
en-10k-qa16 10000 1000 -
en-10k-qa17 1250 125 -
en-10k-qa18 1978 199 -
en-10k-qa19 10000 1000 -
en-10k-qa20 933 93 -
en-valid-qa1 180 200 20
en-valid-qa2 180 200 20
en-valid-qa3 180 200 20
en-valid-qa4 900 1000 100
en-valid-qa5 180 200 20
en-valid-qa6 180 200 20
en-valid-qa7 180 200 20
en-valid-qa8 180 200 20
en-valid-qa9 180 200 20
en-valid-qa10 180 200 20
en-valid-qa11 180 200 20
en-valid-qa12 180 200 20
en-valid-qa13 180 200 20
en-valid-qa14 180 200 20
en-valid-qa15 225 250 25
en-valid-qa16 900 1000 100
en-valid-qa17 113 125 12
en-valid-qa18 179 199 19
en-valid-qa19 900 1000 100
en-valid-qa20 85 93 9
en-valid-10k-qa1 1800 200 200
en-valid-10k-qa2 1800 200 200
en-valid-10k-qa3 1800 200 200
en-valid-10k-qa4 9000 1000 1000
en-valid-10k-qa5 1800 200 200
en-valid-10k-qa6 1800 200 200
en-valid-10k-qa7 1800 200 200
en-valid-10k-qa8 1800 200 200
en-valid-10k-qa9 1800 200 200
en-valid-10k-qa10 1800 200 200
en-valid-10k-qa11 1800 200 200
en-valid-10k-qa12 1800 200 200
en-valid-10k-qa13 1800 200 200
en-valid-10k-qa14 1800 200 200
en-valid-10k-qa15 2250 250 250
en-valid-10k-qa16 9000 1000 1000
en-valid-10k-qa17 1125 125 125
en-valid-10k-qa18 1781 199 197
en-valid-10k-qa19 9000 1000 1000
en-valid-10k-qa20 840 93 93
hn-qa1 200 200 -
hn-qa2 200 200 -
hn-qa3 167 167 -
hn-qa4 1000 1000 -
hn-qa5 200 200 -
hn-qa6 200 200 -
hn-qa7 200 200 -
hn-qa8 200 200 -
hn-qa9 200 200 -
hn-qa10 200 200 -
hn-qa11 200 200 -
hn-qa12 200 200 -
hn-qa13 125 125 -
hn-qa14 200 200 -
hn-qa15 250 250 -
hn-qa16 1000 1000 -
hn-qa17 125 125 -
hn-qa18 198 198 -
hn-qa19 1000 1000 -
hn-qa20 93 94 -
hn-10k-qa1 2000 200 -
hn-10k-qa2 2000 200 -
hn-10k-qa3 1667 167 -
hn-10k-qa4 10000 1000 -
hn-10k-qa5 2000 200 -
hn-10k-qa6 2000 200 -
hn-10k-qa7 2000 200 -
hn-10k-qa8 2000 200 -
hn-10k-qa9 2000 200 -
hn-10k-qa10 2000 200 -
hn-10k-qa11 2000 200 -
hn-10k-qa12 2000 200 -
hn-10k-qa13 1250 125 -
hn-10k-qa14 2000 200 -
hn-10k-qa15 2500 250 -
hn-10k-qa16 10000 1000 -
hn-10k-qa17 1250 125 -
hn-10k-qa18 1977 198 -
hn-10k-qa19 10000 1000 -
hn-10k-qa20 934 94 -
shuffled-qa1 200 200 -
shuffled-qa2 200 200 -
shuffled-qa3 200 200 -
shuffled-qa4 1000 1000 -
shuffled-qa5 200 200 -
shuffled-qa6 200 200 -
shuffled-qa7 200 200 -
shuffled-qa8 200 200 -
shuffled-qa9 200 200 -
shuffled-qa10 200 200 -
shuffled-qa11 200 200 -
shuffled-qa12 200 200 -
shuffled-qa13 200 200 -
shuffled-qa14 200 200 -
shuffled-qa15 250 250 -
shuffled-qa16 1000 1000 -
shuffled-qa17 125 125 -
shuffled-qa18 198 199 -
shuffled-qa19 1000 1000 -
shuffled-qa20 94 93 -
shuffled-10k-qa1 2000 200 -
shuffled-10k-qa2 2000 200 -
shuffled-10k-qa3 2000 200 -
shuffled-10k-qa4 10000 1000 -
shuffled-10k-qa5 2000 200 -
shuffled-10k-qa6 2000 200 -
shuffled-10k-qa7 2000 200 -
shuffled-10k-qa8 2000 200 -
shuffled-10k-qa9 2000 200 -
shuffled-10k-qa10 2000 200 -
shuffled-10k-qa11 2000 200 -
shuffled-10k-qa12 2000 200 -
shuffled-10k-qa13 2000 200 -
shuffled-10k-qa14 2000 200 -
shuffled-10k-qa15 2500 250 -
shuffled-10k-qa16 10000 1000 -
shuffled-10k-qa17 1250 125 -
shuffled-10k-qa18 1978 199 -
shuffled-10k-qa19 10000 1000 -
shuffled-10k-qa20 933 93 -
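
The en-valid and en-valid-10k rows can be cross-checked against the 90%/10% description. The sketch below assumes the train portion is the ceiling of 90% of the original train size, with the remainder going to validation. This rule is inferred from the numbers in the table, not stated by the dataset authors, and the function name split_90_10 is hypothetical.

```python
def split_90_10(n_train):
    """Assumed rule: train = ceil(0.9 * n), validation = the remainder."""
    train = (9 * n_train + 9) // 10  # integer ceiling of 9n/10, avoids float rounding
    return train, n_train - train


# (original train size, listed en-valid train size, listed validation size),
# taken from the en / en-10k and en-valid / en-valid-10k rows of the table.
rows = [
    (200, 180, 20),      # qa1
    (1000, 900, 100),    # qa4
    (125, 113, 12),      # qa17
    (198, 179, 19),      # qa18
    (94, 85, 9),         # qa20
    (1978, 1781, 197),   # 10k qa18
    (933, 840, 93),      # 10k qa20
]

for n, train, valid in rows:
    assert split_90_10(n) == (train, valid), (n, train, valid)
print("listed sizes are consistent with the assumed 90/10 rule")
```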

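Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

Code to generate the tasks is available on GitHub.

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, and Jason Weston, at Facebook Research.

Licensing Information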

Creative Commons Attribution 3.0 License

Citation Information

@misc{dodge2016evaluating,
      title={Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems}, 
      author={Jesse Dodge and Andreea Gane and Xiang Zhang and Antoine Bordes and Sumit Chopra and Alexander Miller and Arthur Szlam and Jason Weston},
      year={2016},
      eprint={1511.06931},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contributions

Thanks to @gchhablani for adding this dataset.