
Entropy-based Attention Regularization

This is an English BERT model fine-tuned with Entropy-based Attention Regularization to reduce lexical overfitting to specific terms of the task. For misogyny identification, it can be used as a debiased alternative to a standard BERT classifier.

Please refer to the paper for all the training details.
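At a high level, the regularizer adds a penalty for tokens whose self-attention distribution has low entropy on top of the usual classification loss. The PyTorch sketch below only illustrates that idea; the helper name, the weight `ear_alpha`, and the layer/head aggregation are assumptions rather than the authors' implementation, so see the paper for the exact formulation.

```python
# Minimal sketch of an entropy-based attention regularization term
# (assumption: not the authors' code; see the paper for the exact objective).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2, output_attentions=True
)

def attention_entropy_penalty(attentions):
    """Mean negative entropy of the self-attention distributions.

    `attentions` is the tuple returned by the model, one tensor per layer
    with shape (batch, heads, seq_len, seq_len). Attention concentrated on
    few tokens has low entropy and therefore yields a larger penalty.
    """
    penalties = []
    for layer_att in attentions:
        entropy = -(layer_att * torch.log(layer_att + 1e-12)).sum(dim=-1)
        penalties.append(-entropy.mean())  # negative entropy: penalize low entropy
    return torch.stack(penalties).mean()

ear_alpha = 0.01  # regularization strength (hypothetical value)
batch = tokenizer(["example input"], return_tensors="pt")
outputs = model(**batch, labels=torch.tensor([0]))
loss = outputs.loss + ear_alpha * attention_entropy_penalty(outputs.attentions)
loss.backward()
```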

Dataset

This model was obtained by fine-tuning on the Automatic Misogyny Identification dataset.

Model

This model is a fine-tuned version of bert-base-uncased. We trained three versions in total, for English and Italian.

Available models:
bert-base-uncased-ear-misogyny
bert-base-uncased-ear-mlma
bert-base-uncased-ear-misogyny-italian
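The checkpoints can be loaded with the Hugging Face transformers library. The snippet below is a minimal usage sketch; the `MilaNLProc/` hub namespace and the returned label names are assumptions, so adjust the identifier to wherever the checkpoint is actually hosted.

```python
# Minimal usage sketch (assumption: the "MilaNLProc/" namespace; replace it
# with the actual hub identifier of the checkpoint you want to use).
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="MilaNLProc/bert-base-uncased-ear-misogyny",
)

texts = [
    "Women are brilliant engineers.",
    "Women should not be allowed to speak in public.",
]
print(classifier(texts))  # label names depend on the uploaded model config
```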

Authors

Citation

If you use this model in your project, please cite it with the following BibTeX entry:

@inproceedings{attanasio-etal-2022-entropy,
    title = "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists",
    author = "Attanasio, Giuseppe  and
      Nozza, Debora  and
      Hovy, Dirk  and
      Baralis, Elena",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2022",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-acl.88",
    doi = "10.18653/v1/2022.findings-acl.88",
    pages = "1105--1119",
    abstract = "Natural Language Processing (NLP) models risk overfitting to specific terms in the training data, thereby reducing their performance, fairness, and generalizability. E.g., neural hate speech detection models are strongly influenced by identity terms like gay, or women, resulting in false positives, severe unintended bias, and lower performance.Most mitigation techniques use lists of identity terms or samples from the target domain during training. However, this approach requires a-priori knowledge and introduces further bias if important terms are neglected.Instead, we propose a knowledge-free Entropy-based Attention Regularization (EAR) to discourage overfitting to training-specific terms. An additional objective function penalizes tokens with low self-attention entropy.We fine-tune BERT via EAR: the resulting model matches or exceeds state-of-the-art performance for hate speech classification and bias metrics on three benchmark corpora in English and Italian.EAR also reveals overfitting terms, i.e., terms most likely to induce bias, to help identify their effect on the model, task, and predictions.",
}

Limitations

Entropy-based Attention Regularization mitigates lexical overfitting but does not remove it entirely. We therefore expect the model to still show biases, e.g., specific keywords triggering specific predictions regardless of their context.

Please refer to our paper for a quantitative evaluation of this mitigation.
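One informal way to observe this residual behavior is to probe the classifier with near-identical sentences that differ only in an identity term. The sketch below reuses the `classifier` pipeline from the usage example above; it is an illustration, not the evaluation protocol of the paper.

```python
# Informal probe for keyword-driven predictions (not the paper's evaluation).
# Reuses the `classifier` pipeline defined in the usage sketch above.
templates = ["I really admire {}.", "{} ruined the meeting today."]
groups = ["women", "men", "my colleagues"]

for template in templates:
    for group in groups:
        text = template.format(group)
        print(f"{text!r} -> {classifier(text)[0]}")
```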

License

GNU GPLv3