Model:

hfl/chinese-macbert-base


Please use 'Bert'-related functions to load this model!
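
For example, with the Hugging Face transformers library (a minimal loading sketch; it assumes transformers and PyTorch are installed and is not part of this repository):

```python
from transformers import BertModel, BertTokenizer

# Load the MacBERT checkpoint through the BERT classes, as noted above.
tokenizer = BertTokenizer.from_pretrained("hfl/chinese-macbert-base")
model = BertModel.from_pretrained("hfl/chinese-macbert-base")

inputs = tokenizer("使用语言模型来预测下一个词的概率。", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```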

This repository contains the resources of our paper "Revisiting Pre-trained Models for Chinese Natural Language Processing", which will be published in "Findings of EMNLP". You can read our camera-ready paper through the ACL Anthology or the arXiv pre-print.

Revisiting Pre-trained Models for Chinese Natural Language Processing
Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu

You may also be interested in:

More resources by HFL: https://github.com/ymcui/HFL-Anthology

Introduction

MacBERT is an improved BERT with a novel MLM-as-correction pre-training task, which mitigates the discrepancy between pre-training and fine-tuning.

Instead of masking with the [MASK] token, which never appears in the fine-tuning stage, we propose to mask with similar words. A similar word is obtained with the Synonyms toolkit (Wang and Hu, 2017), which is based on word2vec (Mikolov et al., 2013) similarity calculations. If an N-gram is selected for masking, we find a similar word for each word individually. In rare cases where no similar word exists, we fall back to random word replacement.
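
A rough sketch of that replacement rule is shown below. This is not the authors' pre-training code: the function and its vocabulary argument are hypothetical, and the Synonyms toolkit call only illustrates the similar-word lookup.

```python
import random

import synonyms  # Synonyms toolkit (Wang and Hu, 2017); pip install synonyms


def correction_word(word, vocab):
    """Pick a similar word to replace `word`; fall back to a random word.

    Hypothetical helper for illustration: the real pipeline operates on
    WordPiece tokens and whole N-grams, which is omitted here.
    """
    candidates, _scores = synonyms.nearby(word)
    candidates = [w for w in candidates if w != word]  # drop the word itself
    if candidates:
        return candidates[0]       # most similar word
    return random.choice(vocab)    # rare case: no similar word found
```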

Here is an example of our pre-training task.

| Task | Example |
| -- | -- |
| Original Sentence | we use a language model to predict the probability of the next word. |
| MLM | we use a language [M] to [M] ##di ##ct the pro [M] ##bility of the next word . |
| Whole word masking | we use a language [M] to [M] [M] [M] the [M] [M] [M] of the next word . |
| N-gram masking | we use a [M] [M] to [M] [M] [M] the [M] [M] [M] [M] [M] next word . |
| MLM as correction | we use a text system to ca ##lc ##ulate the po ##si ##bility of the next word . |

In addition to the new pre-training task, we also employ the following techniques:

  • Whole Word Masking (WWM)
  • N-gram masking (see the span-length sketch after this list)
  • Sentence-Order Prediction (SOP)
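
A minimal span-length sampling sketch for the N-gram masking bullet above; the helper name and the exact weights are illustrative assumptions, not values taken from the released training configuration.

```python
import random

# Illustrative only: shorter spans are sampled more often, but the exact
# weights below are an assumption for demonstration purposes.
NGRAM_LENGTHS = (1, 2, 3, 4)
NGRAM_WEIGHTS = (0.4, 0.3, 0.2, 0.1)

def sample_span_length() -> int:
    """Draw the length of the next span to mask (hypothetical helper)."""
    return random.choices(NGRAM_LENGTHS, weights=NGRAM_WEIGHTS, k=1)[0]
```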

Note that our MacBERT is a drop-in replacement for the original BERT, as there are no differences in the main neural architecture.
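
For example, an existing BERT fine-tuning script should only need the checkpoint name changed; a hypothetical sequence-classification setup:

```python
from transformers import BertForSequenceClassification, BertTokenizerFast

# Hypothetical downstream task: swapping the checkpoint name is the only
# change needed relative to a standard Chinese BERT fine-tuning script.
tokenizer = BertTokenizerFast.from_pretrained("hfl/chinese-macbert-base")
model = BertForSequenceClassification.from_pretrained(
    "hfl/chinese-macbert-base",
    num_labels=2,  # task-specific; assumed here for illustration
)
```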

For more technical details, please check our paper: Revisiting Pre-trained Models for Chinese Natural Language Processing

Citation

If you find our resources or paper useful, please consider including the following citation in your paper.

@inproceedings{cui-etal-2020-revisiting,
    title = "Revisiting Pre-Trained Models for {C}hinese Natural Language Processing",
    author = "Cui, Yiming  and
      Che, Wanxiang  and
      Liu, Ting  and
      Qin, Bing  and
      Wang, Shijin  and
      Hu, Guoping",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.findings-emnlp.58",
    pages = "657--668",
}