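The MT-GenEval benchmark evaluates gender translation accuracy for English -> {Arabic, French, German, Hindi, Italian, Portuguese, Russian, Spanish}. The dataset contains individual sentences with annotations on the gendered target words, and contrastive original-inverted translations with additional preceding context.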
Disclaimer: The MT-GenEval benchmark was released in the EMNLP 2022 paper MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation by Anna Currey, Maria Nadejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, and Georgiana Dinu, and is hosted on GitHub by the Amazon Science organization. The dataset is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
For details on using MT-GenEval for gender accuracy evaluation, refer to the original paper.
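The dataset has two configuration types, sentences and context, mirroring the structure of the original repository. Each configuration name specifies the source and target languages (e.g. sentences_en_ar, context_en_it). The sentences configurations contain masculine and feminine versions of individual sentences, with annotations on the gendered words. Below is an example entry from the sentences_en_it split (all sentences_en_XX splits have the same structure):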
{
  "orig_id": 0,
  "source_feminine": "Pagratidis quickly recanted her confession, claiming she was psychologically pressured and beaten, and until the moment of her execution, she remained firm in her innocence.",
  "reference_feminine": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era stata picchiata, e fino al momento della sua esecuzione, rimase ferma sulla sua innocenza.",
  "source_masculine": "Pagratidis quickly recanted his confession, claiming he was psychologically pressured and beaten, and until the moment of his execution, he remained firm in his innocence.",
  "reference_masculine": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era stato picchiato, e fino al momento della sua esecuzione, rimase fermo sulla sua innocenza.",
  "source_feminine_annotated": "Pagratidis quickly recanted <F>her</F> confession, claiming <F>she</F> was psychologically pressured and beaten, and until the moment of <F>her</F> execution, <F>she</F> remained firm in <F>her</F> innocence.",
  "reference_feminine_annotated": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era <F>stata picchiata</F>, e fino al momento della sua esecuzione, rimase <F>ferma</F> sulla sua innocenza.",
  "source_masculine_annotated": "Pagratidis quickly recanted <M>his</M> confession, claiming <M>he</M> was psychologically pressured and beaten, and until the moment of <M>his</M> execution, <M>he</M> remained firm in <M>his</M> innocence.",
  "reference_masculine_annotated": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era <M>stato picchiato</M>, e fino al momento della sua esecuzione, rimase <M>fermo</M> sulla sua innocenza.",
  "source_feminine_keywords": "her;she;her;she;her",
  "reference_feminine_keywords": "stata picchiata;ferma",
  "source_masculine_keywords": "his;he;his;he;his",
  "reference_masculine_keywords": "stato picchiato;fermo"
}
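The context configurations instead contain different English source texts related to stereotypical professional roles, with additional preceding context and contrastive original-inverted translations. Below is an example entry from the context_en_it split (all context_en_XX splits have the same structure):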
{
  "orig_id": 0,
  "context": "Pierpont told of entering and holding up the bank and then fleeing to Fort Wayne, where the loot was divided between him and three others.",
  "source": "However, Pierpont stated that Skeer was the planner of the robbery.",
  "reference_original": "Comunque, Pierpont disse che Skeer era il pianificatore della rapina.",
  "reference_flipped": "Comunque, Pierpont disse che Skeer era la pianificatrice della rapina."
}
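The two entry layouts can be processed with a few lines of Python. The sketch below is illustrative only: the helper names (`extract_keywords`, `strip_tags`, `avoids_flipped_words`) are hypothetical, and the contrastive check is a simplified assumption based on comparing words unique to each reference, not the metric definition from the paper, which should be consulted for the exact procedure.

```python
import re
from typing import Set

# Matches <F>...</F> and <M>...</M> spans as used in the *_annotated fields.
TAG_PATTERN = re.compile(r"<([FM])>(.*?)</\1>")


def extract_keywords(annotated: str) -> str:
    """Join the text of every tagged span with ';', like the *_keywords fields."""
    return ";".join(match.group(2) for match in TAG_PATTERN.finditer(annotated))


def strip_tags(annotated: str) -> str:
    """Remove the gender tags but keep the words they enclose."""
    return TAG_PATTERN.sub(r"\2", annotated)


def words(text: str) -> Set[str]:
    """Naive lowercase word tokenizer (an assumption; real scoring may differ)."""
    return set(re.findall(r"\w+", text.lower()))


def avoids_flipped_words(hypothesis: str,
                         reference_original: str,
                         reference_flipped: str) -> bool:
    """Return True if the hypothesis contains no word that appears only in
    the flipped (opposite-gender) reference."""
    flipped_only = words(reference_flipped) - words(reference_original)
    return not (words(hypothesis) & flipped_only)


if __name__ == "__main__":
    annotated = ("affermando che era aveva subito pressioni psicologiche e era "
                 "<F>stata picchiata</F>, rimase <F>ferma</F> sulla sua innocenza.")
    print(extract_keywords(annotated))

    original = "Comunque, Pierpont disse che Skeer era il pianificatore della rapina."
    flipped = "Comunque, Pierpont disse che Skeer era la pianificatrice della rapina."
    print(avoids_flipped_words(original, original, flipped))
    print(avoids_flipped_words(flipped, original, flipped))
```

Word-level matching is deliberately crude: short function words such as articles can be unique to one reference and trigger false positives, so a real evaluation would need more careful tokenization.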
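For every sentences_en_XX configuration, the train split has 1200 examples and the test split has 300 examples. For the context_en_XX configurations, the number of examples depends on the language pair: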
| Configuration | # Train | # Test |
|---|---|---|
| context_en_ar | 792 | 1100 |
| context_en_fr | 477 | 1099 |
| context_en_de | 598 | 1100 |
| context_en_hi | 397 | 1098 |
| context_en_it | 465 | 1904 |
| context_en_pt | 574 | 1089 |
| context_en_ru | 583 | 1100 |
| context_en_es | 534 | 1096 |
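As a convenience, the naming scheme and split sizes above can be captured in a small lookup. This is a minimal sketch: `config_name` and `split_sizes` are hypothetical helpers, and the numbers are copied from the table and the sentence preceding it rather than read from the data.

```python
from typing import Tuple

# Target-language codes as they appear in the configuration names.
TARGET_LANGS = ("ar", "fr", "de", "hi", "it", "pt", "ru", "es")

# (train, test) sizes for context_en_XX, copied from the table above.
CONTEXT_SIZES = {
    "ar": (792, 1100),
    "fr": (477, 1099),
    "de": (598, 1100),
    "hi": (397, 1098),
    "it": (465, 1904),
    "pt": (574, 1089),
    "ru": (583, 1100),
    "es": (534, 1096),
}

# Every sentences_en_XX configuration has the same split sizes.
SENTENCES_SIZES = (1200, 300)


def config_name(kind: str, target_lang: str) -> str:
    """Build a configuration name such as 'context_en_it'."""
    if kind not in ("sentences", "context"):
        raise ValueError(f"unknown configuration type: {kind!r}")
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unknown target language: {target_lang!r}")
    return f"{kind}_en_{target_lang}"


def split_sizes(kind: str, target_lang: str) -> Tuple[int, int]:
    """Return the documented (train, test) sizes for a configuration."""
    config_name(kind, target_lang)  # validates both arguments
    return SENTENCES_SIZES if kind == "sentences" else CONTEXT_SIZES[target_lang]


if __name__ == "__main__":
    for lang in TARGET_LANGS:
        print(config_name("context", lang), split_sizes("context", lang))
```

If the data is loaded through the Hugging Face `datasets` library (an assumption; the hosting identifier is not stated in this section), the string returned by `config_name` would be passed as the configuration name argument of `load_dataset`.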
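In developing MT-GenEval, our goal was to create a realistic, gender-balanced dataset that naturally incorporates a diverse range of gender phenomena. To this end, we extracted English source sentences from Wikipedia as the basis for the dataset. We automatically pre-selected relevant sentences using a list of English gendered words based on the one compiled by Zhao et al. (2018).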
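The pre-selection step can be pictured as a simple word-list filter. The sketch below is only an illustration under stated assumptions: `GENDERED_WORDS` is a tiny placeholder vocabulary, not the list derived from Zhao et al. (2018), and the function names are hypothetical rather than taken from the authors' pipeline.

```python
import re
from typing import Iterable, List, Set

# Placeholder vocabulary for illustration only; NOT the list used by the authors.
GENDERED_WORDS: Set[str] = {"he", "she", "him", "her", "his", "hers", "actor", "actress"}


def contains_gendered_word(sentence: str, vocabulary: Set[str] = GENDERED_WORDS) -> bool:
    """Check whether any whole word of the sentence is in the vocabulary."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(token in vocabulary for token in tokens)


def preselect(sentences: Iterable[str], vocabulary: Set[str] = GENDERED_WORDS) -> List[str]:
    """Keep only sentences mentioning at least one gendered word."""
    return [s for s in sentences if contains_gendered_word(s, vocabulary)]


if __name__ == "__main__":
    candidates = [
        "She remained firm in her innocence.",
        "The loot was divided between three others.",
    ]
    print(preselect(candidates))
```

Matching on whole tokens rather than substrings matters here: "The" contains the letters "he" but should not count as a gendered word.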
For details on how the dataset was created, refer to the original paper, MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation.
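The original authors of MT-GenEval are the curators of the original dataset. For problems or updates regarding this version of the dataset, please contact gabriele.sarti996@gmail.com.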
The dataset is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
@inproceedings{currey-etal-2022-mtgeneval,
    title = "{MT-GenEval}: {A} Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation",
    author = "Currey, Anna and Nadejde, Maria and Pappagari, Raghavendra and Mayer, Mia and Lauly, Stanislas and Niu, Xing and Hsu, Benjamin and Dinu, Georgiana",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2211.01355",
}