Dataset:

copenlu/sufficient_facts

English

Dataset Card: sufficient_facts

Dataset Summary

This is the SufficientFacts dataset, introduced in the paper "Fact Checking with Insufficient Evidence", which was accepted at the TACL journal in 2022.

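Automating the fact checking (FC) process relies on information obtained from external sources. In this work, we posit that it is crucial for FC models to make veracity predictions only when there is sufficient evidence, and otherwise to indicate that the evidence is not enough. To this end, we are the first to study what information FC models consider sufficient, and we make three main contributions. First, we conduct an in-depth empirical analysis of the task with a new fluency-preserving method for omitting information from the evidence at the constituent and sentence level. We identify when models consider the remaining evidence (in)sufficient for FC, based on three trained models with different Transformer architectures and three FC datasets. Second, we ask annotators whether the omitted evidence was important for FC, resulting in a novel diagnostic dataset, SufficientFacts, for FC with omitted evidence. We find that models are least successful in detecting missing evidence when adverbial modifiers are omitted (21% accuracy), whereas it is easiest for omitted date modifiers (63% accuracy). Finally, we propose a novel data augmentation strategy for contrastive self-learning of missing evidence by employing the proposed omission method combined with tri-training. It improves performance for Evidence Sufficiency Prediction by up to 17.8 F1 score, which in turn improves FC performance by up to 2.6 F1 score.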

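Languages

English

Dataset Structure

The dataset consists of three files, one for each of the datasets FEVER, HoVer, and VitaminC. Each file consists of JSON lines in the following format: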

{
    "claim": "Unison (Celine Dion album) was originally released by Atlantic Records.", 
    "evidence": [
        [
            "Unison (Celine Dion album)", 
            "The album was originally released on 2 April 1990 ."
        ]
    ],
    "label_before": "REFUTES", 
    "label_after": "NOT ENOUGH", 
    "agreement": "agree_ei", 
    "type": "PP", 
    "removed": ["by Columbia Records"], 
    "text_orig": "[[Unison (Celine Dion album)]] The album was originally released on 2 April 1990 <span style=\"color:red;\">by Columbia Records</span> ."
}
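A minimal sketch of reading records in this format with Python's standard json module. The sample line is abbreviated from the example above (text_orig is left out for brevity), the helper name parse_records is hypothetical, and reading each evidence item as a [title, sentence] pair is an assumption based on that single example.

```python
import json

# One JSON line, abbreviated from the example record above.
SAMPLE_LINE = (
    '{"claim": "Unison (Celine Dion album) was originally released by Atlantic Records.", '
    '"evidence": [["Unison (Celine Dion album)", '
    '"The album was originally released on 2 April 1990 ."]], '
    '"label_before": "REFUTES", "label_after": "NOT ENOUGH", '
    '"agreement": "agree_ei", "type": "PP", "removed": ["by Columbia Records"]}'
)


def parse_records(lines):
    """Parse an iterable of JSON-lines strings, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


# In practice `lines` would be an open file object; a list stands in for it here.
records = parse_records([SAMPLE_LINE, ""])
record = records[0]

# Assumption: each evidence item is a [title, sentence] pair, as in the example.
title, sentence = record["evidence"][0]

print(record["label_before"], "->", record["label_after"])
print(title, "|", sentence)
```

Iterating over an open file yields one line at a time, so the same helper can be applied to a whole file without loading it as a single JSON document.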

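Data Instances

  • FEVER - 600 constituent-level instances, 400 sentence-level instances;
  • HoVer - 600 constituent-level instances, 400 sentence-level instances;
  • VitaminC - 600 constituent-level instances.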

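Data Fields

  • claim - the claim being verified
  • evidence - the augmented evidence for the claim, i.e. the evidence with some information removed
  • label_before - the original label of the claim-evidence pair, before information was removed from the evidence
  • label_after - the label of the augmented claim-evidence pair, after information was removed from the evidence, as annotated by crowd workers
  • type - the type of the information removed from the evidence. The types are fine-grained; their mapping to the general types (7 constituent types and 1 sentence type) can be found in the types.json file.
  • removed - the text of the information removed from the evidence
  • text_orig - the original text of the evidence, as presented to crowd workers for review; the text of the removed information is inside <span style="color:red;"></span> tags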
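The relationship between text_orig and removed can be sketched as follows. This is an illustration only: the regular expression targets the span markup seen in the example record, the helper names removed_spans and strip_removed are hypothetical, and collapsing whitespace after deletion is an assumption rather than the authors' fluency-preserving procedure.

```python
import re

# Matches the red-span markup seen in the example record's text_orig field.
RED_SPAN = re.compile(r'<span style="color:red;">(.*?)</span>', re.DOTALL)


def removed_spans(text_orig):
    """Return the text of every span marked as removed."""
    return RED_SPAN.findall(text_orig)


def strip_removed(text_orig):
    """Delete the marked spans and collapse the leftover whitespace."""
    without_spans = RED_SPAN.sub("", text_orig)
    return re.sub(r"\s+", " ", without_spans).strip()


text_orig = (
    "[[Unison (Celine Dion album)]] The album was originally released "
    'on 2 April 1990 <span style="color:red;">by Columbia Records</span> .'
)

print(removed_spans(text_orig))
print(strip_removed(text_orig))
```

For the example record, the extracted span matches the removed field, and the stripped text matches the evidence sentence apart from the bracketed title prefix.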

Data Splits

name    test_fever    test_hover    test_vitaminc
test    1000          1000          600

Augmented from the test splits of the corresponding datasets.

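Annotations

Annotation process

The workers were given the following task description:

For each evidence text, some facts have been removed (marked in red). You should annotate whether, given the facts remaining in the evidence text, the evidence is still enough to verify the claim.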

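  • Select "ENOUGH - IRRELEVANT" if the remaining information is still enough to verify the claim because the removed information is irrelevant for identifying whether the evidence supports or refutes it. See examples 1 and 2.
  • Select "ENOUGH - REPEATED" if the remaining information is still enough to verify the claim because the removed information is relevant but is also present in the remaining (not red) text. See example 3.
  • Select "NOT ENOUGH" when 1) the removed information is relevant for verifying the claim and 2) it is not present (repeated) in the remaining text. See examples 4, 5, and 6.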

Note: You should not use your own knowledge or beliefs! You should rely only on the evidence provided for the claim.
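The three options in the task description reduce to a decision over two judgments: whether the removed information is relevant to the claim, and whether it is repeated in the remaining text. A minimal sketch of that rule follows; the function name annotation_choice is hypothetical, and the option strings are English renderings of the three choices, not identifiers taken from the dataset files.

```python
def annotation_choice(relevant, repeated):
    """Map the two judgments from the task description to one of the three options.

    relevant: the removed information matters for verifying the claim.
    repeated: the removed information also appears in the remaining text.
    """
    if not relevant:
        return "ENOUGH - IRRELEVANT"
    if repeated:
        return "ENOUGH - REPEATED"
    return "NOT ENOUGH"


print(annotation_choice(relevant=True, repeated=False))
```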

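The workers were then shown example annotations of instances. Finally, annotators had to pass a qualification test before being allowed to annotate instances for the task. The resulting inter-annotator agreement for SufficientFacts is 0.81 Fleiss' κ, computed over three annotators.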
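For readers unfamiliar with the agreement statistic, the following is a minimal sketch of the standard Fleiss' κ formula in plain Python. The ratings are toy values for three hypothetical raters and three categories, not the actual SufficientFacts annotations, and returning 1.0 when all ratings fall into a single category is a convention chosen here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters who put item i in category j."""
    if not counts:
        raise ValueError("at least one item is required")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall proportion of ratings per category.
    n_categories = len(counts[0])
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)

    if p_e == 1.0:
        # All ratings are in one category; treated as full agreement here.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data: 4 items, 3 raters, 3 categories (illustrative values only).
toy_counts = [[3, 0, 0], [0, 3, 0], [2, 1, 0], [0, 0, 3]]
print(round(fleiss_kappa(toy_counts), 3))
```

The statistic is 1 under perfect agreement and around 0 when agreement is no better than chance.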

Who are the annotators?

The annotations were produced by Amazon Mechanical Turk workers.

Additional Information

Licensing Information

MIT

Citation Information

@article{10.1162/tacl_a_00486,
    author = {Atanasova, Pepa and Simonsen, Jakob Grue and Lioma, Christina and Augenstein, Isabelle},
    title = "{Fact Checking with Insufficient Evidence}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {10},
    pages = {746-763},
    year = {2022},
    month = {07},
    abstract = "{Automating the fact checking (FC) process relies on information obtained from external sources. In this work, we posit that it is crucial for FC models to make veracity predictions only when there is sufficient evidence and otherwise indicate when it is not enough. To this end, we are the first to study what information FC models consider sufficient by introducing a novel task and advancing it with three main contributions. First, we conduct an in-depth empirical analysis of the task with a new fluency-preserving method for omitting information from the evidence at the constituent and sentence level. We identify when models consider the remaining evidence (in)sufficient for FC, based on three trained models with different Transformer architectures and three FC datasets. Second, we ask annotators whether the omitted evidence was important for FC, resulting in a novel diagnostic dataset, SufficientFacts, for FC with omitted evidence. We find that models are least successful in detecting missing evidence when adverbial modifiers are omitted (21\\% accuracy), whereas it is easiest for omitted date modifiers (63\\% accuracy). Finally, we propose a novel data augmentation strategy for contrastive self-learning of missing evidence by employing the proposed omission method combined with tri-training. It improves performance for Evidence Sufficiency Prediction by up to 17.8 F1 score, which in turn improves FC performance by up to 2.6 F1 score.}",
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00486},
    url = {https://doi.org/10.1162/tacl\_a\_00486},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00486/2037141/tacl\_a\_00486.pdf},
}

Contributions

Thanks to @apepa for adding this dataset.