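Model:
cardiffnlp/twitter-roberta-base-hate-latest
This model is a version of cardiffnlp/twitter-roberta-base-2022-154m fine-tuned for binary hate speech classification. It was fine-tuned on a combination of 13 different English hate speech datasets.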
Dataset | Accuracy | Macro-F1 | Weighted-F1 |
---|---|---|---|
hatEval, SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter | 0.5831 | 0.5646 | 0.548 |
ucberkeley-dlab/measuring-hate-speech | 0.9273 | 0.9193 | 0.928 |
Detecting East Asian Prejudice on Social Media | 0.9231 | 0.6623 | 0.9428 |
Call me sexist, but | 0.9686 | 0.9203 | 0.9696 |
Predicting the Type and Target of Offensive Posts in Social Media | 0.9164 | 0.6847 | 0.9098 |
HateXplain | 0.8653 | 0.845 | 0.8662 |
Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior | 0.7801 | 0.7446 | 0.7614 |
Multilingual and Multi-Aspect Hate Speech Analysis | 0.9944 | 0.4986 | 0.9972 |
Hate Speech and Offensive Content Identification in Indo-European Languages | 0.8779 | 0.6904 | 0.8706 |
Are You a Racist or Am I Seeing Things? | 0.921 | 0.8935 | 0.9216 |
Automated Hate Speech Detection | 0.9423 | 0.9249 | 0.9429 |
Hate Towards the Political Opponent | 0.8783 | 0.6595 | 0.8788 |
Hateful Symbols or Hateful People? | 0.8187 | 0.7833 | 0.8323 |
Overall | 0.8766 | 0.7531 | 0.8745 |
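
The three columns can diverge sharply on imbalanced data: some rows pair very high accuracy with a Macro-F1 near 0.5. The sketch below uses the standard definitions of the three metrics to show how that pattern can arise. The toy labels are invented for illustration and are not drawn from any of the datasets above, and the `HATE` label name and the helper names are assumptions.

```python
from collections import Counter


def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(y_true, y_pred):
    """Return accuracy, Macro-F1 and Weighted-F1 for two label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    f1s = {lab: f1_for_label(y_true, y_pred, lab) for lab in labels}
    macro = sum(f1s.values()) / len(labels)  # every class counts equally
    weighted = sum(f1s[lab] * support[lab] for lab in labels) / n  # weighted by support
    return {"accuracy": accuracy, "macro_f1": macro, "weighted_f1": weighted}


# Invented toy data: 98 negative examples, 2 positive ones, and a
# classifier that always predicts the majority class.
y_true = ["NOT-HATE"] * 98 + ["HATE"] * 2
y_pred = ["NOT-HATE"] * 100

scores = summarize(y_true, y_pred)
print(scores)
```

Here accuracy is 0.98 and Weighted-F1 is about 0.97, while Macro-F1 falls just below 0.5 because the minority class contributes an F1 of 0. Macro-F1 is therefore the more informative column when the hateful class is rare.
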
Install tweetnlp via pip.

    pip install tweetnlp

Load the model in Python.

    import tweetnlp

    # Load the fine-tuned hate speech classifier
    model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-latest")
    model.predict('I love everybody :)')
    # >> {'label': 'NOT-HATE'}

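
As a rough sketch of how the prediction format could be used downstream, the helper below keeps only the texts whose label is not `NOT-HATE`. It relies only on the `{'label': ...}` dictionary shape shown in this card. The `flag_texts` and `stub_predict` names are hypothetical, and the stub stands in for the real model so the snippet runs without downloading anything.

```python
from typing import Callable, Dict, Iterable, List


def flag_texts(
    texts: Iterable[str],
    predict: Callable[[str], Dict[str, str]],
) -> List[str]:
    """Return the texts whose predicted label is anything other than 'NOT-HATE'."""
    return [text for text in texts if predict(text)["label"] != "NOT-HATE"]


def stub_predict(text: str) -> Dict[str, str]:
    """Hypothetical stand-in for the classifier; the 'HATE' label name is an assumption."""
    return {"label": "HATE" if "badword" in text else "NOT-HATE"}


flagged = flag_texts(["I love everybody :)", "badword example"], stub_predict)
print(flagged)
```

With the real classifier loaded as `model`, the same helper could be called as `flag_texts(texts, model.predict)`.
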
    @misc{antypas2023robust,
      title={Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation},
      author={Dimosthenis Antypas and Jose Camacho-Collados},
      year={2023},
      eprint={2307.01680},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }