Model:

cardiffnlp/bert-base-multilingual-cased-sentiment-multilingual

This model is a version of bert-base-multilingual-cased fine-tuned on cardiffnlp/tweet_sentiment_multilingual (all) via tweetnlp. It was trained on the train split, and its parameters were tuned on the validation split.

On the test split (link), it achieves the following metrics:

  • F1 (micro): 0.6169540229885058
  • F1 (macro): 0.6168385894019698
  • Accuracy: 0.6169540229885058
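
The micro-averaged F1 and the accuracy above are identical. That is expected for single-label classification: every wrong prediction counts as exactly one false positive (for the predicted class) and one false negative (for the true class), so micro F1 reduces to accuracy. The following is a minimal sketch of how the three metrics relate. It uses made-up toy labels, not the model's actual test predictions, and the function name `f1_scores` is hypothetical rather than part of tweetnlp.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1, accuracy) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    def f1(tp_c, fp_c, fn_c):
        denom = 2 * tp_c + fp_c + fn_c
        return 2 * tp_c / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return micro, macro, accuracy


# Toy labels for illustration only (not taken from the test split).
y_true = ["positive", "positive", "positive", "positive", "neutral", "negative"]
y_pred = ["positive", "positive", "positive", "neutral", "neutral", "positive"]

micro, macro, accuracy = f1_scores(y_true, y_pred)
print(f"micro F1={micro:.4f}  macro F1={macro:.4f}  accuracy={accuracy:.4f}")
```

In this toy example the macro F1 is lower than the micro F1 because one class is never predicted correctly. In the reported results the two values are nearly equal, which suggests performance is similar across the sentiment classes.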

Usage

Install tweetnlp via pip.

pip install tweetnlp

Load the model in Python.

import tweetnlp
model = tweetnlp.Classifier("cardiffnlp/bert-base-multilingual-cased-sentiment-multilingual", max_length=128)
model.predict('Get the all-analog Classic Vinyl Edition of "Takin Off" Album from {@herbiehancock@} via {@bluenoterecords@} link below {{URL}}')

Reference

@inproceedings{dimosthenis-etal-2022-twitter,
    title = "{T}witter {T}opic {C}lassification",
    author = "Antypas, Dimosthenis  and
    Ushio, Asahi  and
    Camacho-Collados, Jose  and
    Neves, Leonardo  and
    Silva, Vitor  and
    Barbieri, Francesco",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics"
}