cardiffnlp/roberta-large-tweet-topic-single-all

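This model is a fine-tuned version of roberta-large on tweet_topic_single. It was fine-tuned on the train_all split and validated on the test_2021 split of tweet_topic. The fine-tuning script can be found here. The model achieves the following results on the test_2021 set: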

F1 (micro): 0.896042528056704
F1 (macro): 0.8000614127334341
Accuracy: 0.896042528056704
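
The micro F1 and the accuracy above are identical. That is expected for single-label classification: every wrong prediction counts as one false positive (for the predicted class) and one false negative (for the true class), so micro F1 reduces to accuracy. Macro F1 averages the per-class scores with equal weight, so it drops when rarer classes are predicted poorly. The following is a minimal sketch of the three metrics, not the evaluation code used for this model; the function name `f1_scores` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1, accuracy) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return micro, macro, accuracy


# Toy, imbalanced example (hypothetical labels, not the model's label set).
y_true = ["sports", "sports", "sports", "music"]
y_pred = ["sports", "sports", "sports", "sports"]
micro, macro, accuracy = f1_scores(y_true, y_pred)
print(micro, macro, accuracy)
```

In the toy example the minority class is never predicted, so macro F1 falls well below micro F1 while micro F1 still equals accuracy.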

Usage

from transformers import pipeline

# Load the fine-tuned checkpoint as a text-classification pipeline.
pipe = pipeline("text-classification", "cardiffnlp/roberta-large-tweet-topic-single-all")
# Classify a single tweet; the result holds the predicted topic label and its score.
topic = pipe("Love to take night time bike rides at the jersey shore. Seaside Heights boardwalk. Beautiful weather. Wishing everyone a safe Labor Day weekend in the US.")
print(topic)
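
To classify several tweets at once and inspect the full score distribution, something like the sketch below may help. It assumes a recent transformers release in which passing `top_k=None` makes the pipeline return a score for every label, one list of `{"label", "score"}` dicts per input text. The helper names `top_topic` and `classify` are hypothetical and not part of the library or this model card.

```python
MODEL_ID = "cardiffnlp/roberta-large-tweet-topic-single-all"


def top_topic(scores):
    """Return (label, score) for the highest-scoring entry.

    `scores` is a list of {"label": str, "score": float} dicts.
    """
    best = max(scores, key=lambda s: s["score"])
    return best["label"], best["score"]


def classify(texts, model=MODEL_ID):
    """Classify a list of tweets and return one (label, score) pair per tweet."""
    # Imported lazily so top_topic can be used without transformers installed.
    from transformers import pipeline

    # Assumption: top_k=None returns the scores of all labels for each input.
    pipe = pipeline("text-classification", model, top_k=None)
    return [top_topic(scores) for scores in pipe(texts)]
```

Calling `classify(["first tweet", "second tweet"])` downloads the checkpoint on first use and returns the best label with its score for each tweet.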

Reference

@inproceedings{dimosthenis-etal-2022-twitter,
    title = "{T}witter {T}opic {C}lassification",
    author = "Antypas, Dimosthenis  and
    Ushio, Asahi  and
    Camacho-Collados, Jose  and
    Neves, Leonardo  and
    Silva, Vitor  and
    Barbieri, Francesco",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics"
}

Author:

Cardiff NLP

Dataset size:

1.33 GB