Model:

TransQuest/siamesetransquest-da-multilingual

Task:

Feature Extraction

Libraries:

PyTorch Transformers

Language:

multilingual-multilingual

Other:

xlm-roberta Quality Estimation siamesetransquest da

License:

apache-2.0


TransQuest: Translation Quality Estimation with Cross-lingual Transformers

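The goal of quality estimation (QE) is to evaluate the quality of a translation without access to a reference translation. High-accuracy QE that can be easily deployed across many language pairs is the missing piece in many commercial translation workflows, because such systems have numerous potential uses. They can select the best translation when several translation engines are available, or inform the end user about the reliability of automatically translated content. In addition, QE systems can be used to decide whether a translation can be published as it is in a given context, or whether it needs human post-editing before publication or retranslation from scratch by a human. Quality estimation can be performed at different levels: document level, sentence level, and word level.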
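The publish / post-edit / retranslate decision described above can be sketched as a simple thresholding step on a sentence-level score. This is only an illustration: the function names and both cut-off values are hypothetical, and suitable thresholds depend on the scale of the scores your QE model produces.

```python
from typing import List, Tuple

# Hypothetical cut-offs (assumptions, not values from TransQuest);
# tune them on held-out data for your own score scale.
PUBLISH_THRESHOLD = 0.5
POST_EDIT_THRESHOLD = -0.5


def triage(score: float,
           publish_at: float = PUBLISH_THRESHOLD,
           post_edit_at: float = POST_EDIT_THRESHOLD) -> str:
    """Map a sentence-level quality score (higher is better) to an action."""
    if post_edit_at > publish_at:
        raise ValueError("post_edit_at must not exceed publish_at")
    if score >= publish_at:
        return "publish"
    if score >= post_edit_at:
        return "post-edit"
    return "retranslate"


def triage_batch(scores: List[float]) -> List[Tuple[float, str]]:
    """Apply triage() to every score and keep the score next to its action."""
    return [(s, triage(s)) for s in scores]


if __name__ == "__main__":
    for score, action in triage_batch([0.9, 0.1, -1.2]):
        print(f"{score:+.2f} -> {action}")
```

Keeping the thresholds as parameters makes it easy to use stricter cut-offs for content where errors are costly.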

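With TransQuest, we have open-sourced our research in translation quality estimation, which also won the sentence-level direct assessment quality estimation shared task at WMT 2020. TransQuest outperforms current open-source quality estimation frameworks such as OpenKiwi and DeepQuest.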

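Features

Sentence-level translation quality estimation, covering both post-editing effort prediction and direct assessment.
Word-level translation quality estimation, capable of predicting the quality of source words, target words, and target gaps.
Outperforms current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi in all the languages experimented with.
Pre-trained quality estimation models are available for fifteen language pairs.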

Installation

Install with pip

pip install transquest

Install from source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt

Using the pre-trained model

import torch
from transquest.algo.sentence_level.siamesetransquest.run_model import SiameseTransQuestModel


model = SiameseTransQuestModel("TransQuest/siamesetransquest-da-multilingual")
predictions = model.predict([["Reducerea acestor conflicte este importantă pentru conservare.", "Reducing these conflicts is not important for preservation."]])
print(predictions)
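One use of such scores is choosing among several candidate translations of the same source sentence. The sketch below shows that ranking step in isolation. `rank_candidates` and `toy_score_fn` are hypothetical helpers, not part of TransQuest, and the toy scorer is only a stand-in so the example runs without downloading the model. If `model.predict` returns one score per `[source, target]` pair, which is an assumption worth checking against the version you have installed, it could be passed as `score_fn` instead.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer takes [[source, target], ...] pairs and returns one score per pair.
ScoreFn = Callable[[List[List[str]]], Sequence[float]]


def rank_candidates(source: str,
                    candidates: List[str],
                    score_fn: ScoreFn) -> List[Tuple[str, float]]:
    """Score every candidate against the source and sort best-first."""
    if not candidates:
        raise ValueError("need at least one candidate translation")
    pairs = [[source, candidate] for candidate in candidates]
    scores = [float(s) for s in score_fn(pairs)]
    if len(scores) != len(candidates):
        raise ValueError("score_fn must return one score per pair")
    return sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)


def toy_score_fn(pairs: List[List[str]]) -> List[float]:
    """Stand-in scorer (NOT a quality model): penalises word-count mismatch."""
    return [-float(abs(len(src.split()) - len(tgt.split()))) for src, tgt in pairs]


if __name__ == "__main__":
    source = "Reducerea acestor conflicte este importantă pentru conservare."
    candidates = [
        "Reducing these conflicts is not important for preservation.",
        "Reducing these conflicts is important for conservation.",
    ]
    for text, score in rank_candidates(source, candidates, toy_score_fn):
        print(f"{score:+.1f}  {text}")
```

Passing the scorer in as a callable keeps the ranking logic independent of any particular model, so the same helper works with a different QE model or a cached set of scores.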

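Documentation

For more details, see the documentation.

Installation - Install TransQuest locally using pip.

Architectures - Check out the architectures implemented in TransQuest.

Sentence-level Architectures - We have released two architectures, MonoTransQuest and SiameseTransQuest, for sentence-level quality estimation.

Word-level Architecture - We have released MicroTransQuest for word-level quality estimation.

Examples - We provide several examples showing how to use TransQuest on recent WMT quality estimation shared tasks.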

Sentence-level Examples

Word-level Examples

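Pre-trained Models - We provide pre-trained quality estimation models for fifteen language pairs, covering both sentence level and word level.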

Sentence-level Models

Word-level Models

Contact - Contact us with any questions about TransQuest.

Citation

If you are using the word-level architecture, please consider citing this paper, which has been accepted to ACL 2021.

@InProceedings{ranasinghe2021,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year = {2021}
}

If you are using the sentence-level architectures, please consider citing the following papers, which were published at COLING 2020 and at WMT 2020, held at EMNLP 2020.

@InProceedings{transquest:2020a,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}

Author:

TransQuest

Dataset size:

2.09 GB