Model:
cardiffnlp/twitter-roberta-base-2021-124m-topic-multi
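This model is a version of cardiffnlp/twitter-roberta-base-2021-124m fine-tuned on cardiffnlp/tweet_topic_multi with tweetnlp. It was trained on the train_all split, and its hyperparameters were tuned on the validation_2021 split.
On the test_2021 test split (link), it achieves the following metrics: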
Install tweetnlp via pip.
pip install tweetnlp
Load the model in Python.
import tweetnlp
model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-2021-124m-topic-multi", max_length=128)
model.predict('Get the all-analog Classic Vinyl Edition of "Takin Off" Album from {@herbiehancock@} via {@bluenoterecords@} link below {{URL}}')
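The example input above is not a raw tweet: account mentions are wrapped as {@name@} and the link is masked as {{URL}}. Raw tweets will likely need to be brought into the same form before prediction. The helper below is a minimal sketch of that step, not the official preprocessing: the name format_tweet and both regular expressions are assumptions inferred from the example string alone, and the sketch wraps every mention the same way.

```python
import re

# Assumed patterns, inferred from the example input above; not taken from tweetnlp.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@(\w+)")


def format_tweet(text: str) -> str:
    """Mask URLs as {{URL}} and wrap @mentions as {@name@} (hypothetical helper)."""
    # Mask links first so that any '@' inside a URL is not treated as a mention.
    text = URL_PATTERN.sub("{{URL}}", text)
    # Wrap each remaining @mention in the {@...@} markers.
    text = MENTION_PATTERN.sub(r"{@\1@}", text)
    return text


raw = "Album from @herbiehancock via @bluenoterecords link below https://t.co/abc"
formatted = format_tweet(raw)
print(formatted)
```

The result can then be passed to model.predict. The helper is not idempotent, so apply it only to raw text that has not already been formatted.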
@inproceedings{camacho-collados-etal-2022-tweetnlp,
    title = {{T}weet{NLP}: {C}utting-{E}dge {N}atural {L}anguage {P}rocessing for {S}ocial {M}edia},
    author = {Camacho-Collados, Jose and Rezaee, Kiamehr and Riahi, Talayeh and Ushio, Asahi and Loureiro, Daniel and Antypas, Dimosthenis and Boisson, Joanne and Espinosa-Anke, Luis and Liu, Fangyu and Mart{\'\i}nez-C{\'a}mara, Eugenio and others},
    booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
    month = nov,
    year = {2022},
    address = {Abu Dhabi, U.A.E.},
    publisher = {Association for Computational Linguistics},
}