Language: English

Discourse marker prediction / discourse connective prediction pretrained model

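A roberta-base model pretrained for discourse marker prediction on the Discovery dataset. It reaches a validation accuracy of 30.93% (always predicting the majority class yields 0.57%).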
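To put the two figures in perspective, the short calculation below compares the reported accuracy with the majority-class baseline. The class count of 174 is an assumption taken from the Discovery dataset description, not from this card; with balanced classes it would reproduce the 0.57% baseline.

```python
# Figures reported in this card.
VALIDATION_ACCURACY = 30.93  # percent
MAJORITY_BASELINE = 0.57     # percent

# Assumption (not stated in this card): Discovery has 174 balanced marker classes.
ASSUMED_NUM_CLASSES = 174


def balanced_baseline(num_classes: int) -> float:
    """Accuracy (in percent) of always predicting one class when classes are balanced."""
    return 100.0 / num_classes


def lift(accuracy: float, baseline: float) -> float:
    """How many times better the model is than the baseline."""
    return accuracy / baseline


if __name__ == "__main__":
    print(f"balanced baseline: {balanced_baseline(ASSUMED_NUM_CLASSES):.2f}%")
    print(f"lift over baseline: {lift(VALIDATION_ACCURACY, MAJORITY_BASELINE):.1f}x")
```

Under that assumption the model is roughly 54 times more accurate than the trivial baseline, which is why a seemingly low 30.93% is a meaningful result on a task with this many classes.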

https://github.com/sileod/discovery

https://huggingface.co/datasets/discovery

This model can also be used as a pretrained model for natural language understanding (NLU), pragmatics, and discourse tasks.
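A minimal sketch of how the model might be queried with the Hugging Face `transformers` library is shown below. It assumes the checkpoint ships a sequence-classification head whose `id2label` config maps class indices to discourse markers, and that the two sentences are encoded as a pair. The function names `top_k_markers` and `predict_markers` are hypothetical helpers, and the Hub model ID is not given in this card, so it is passed in as a parameter.

```python
from typing import Dict, List, Sequence, Tuple


def top_k_markers(
    scores: Sequence[float], id2label: Dict[int, str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring labels with their scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(id2label[i], scores[i]) for i in ranked[:k]]


def predict_markers(
    model_id: str, sentence1: str, sentence2: str, k: int = 3
) -> List[Tuple[str, float]]:
    """Predict the most likely discourse markers linking two sentences.

    model_id is the checkpoint's Hub ID or a local path (not stated in this card).
    """
    # Imported lazily so top_k_markers can be used without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: the model expects the two sentences as a sentence pair.
    inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    probs = torch.softmax(logits, dim=-1).tolist()
    return top_k_markers(probs, model.config.id2label, k)
```

Calling `predict_markers(model_id, "It was raining.", "We stayed inside.")` with the appropriate model ID would return up to three candidate markers with their probabilities. To use the model as a starting point for another task, the same checkpoint can be loaded and fine-tuned with a new classification head.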

Citing and authors

@inproceedings{sileo-etal-2019-mining,
    title = "Mining Discourse Markers for Unsupervised Sentence Representation Learning",
    author = "Sileo, Damien  and
      Van De Cruys, Tim  and
      Pradel, Camille  and
      Muller, Philippe",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1351",
    doi = "10.18653/v1/N19-1351",
    pages = "3477--3486",
}