d4data/bias-detection-model

Model:

d4data/bias-detection-model

Task:

Text Classification

Libraries:

TensorFlow, Transformers

Language:

English

Other:

distilbert, Text Classification, Carbon Emissions

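About the Model

This is an English sequence classification model that detects bias and fairness in sentences (news articles). It was trained on the MBAD dataset. The model is built on top of distilbert-base-uncased and was trained for 30 epochs with a batch size of 16, a learning rate of 5e-5, and a maximum sequence length of 512.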
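The training settings quoted above can be collected in a small configuration object. This is only an illustrative sketch: the `TrainingConfig` class and its field names are hypothetical and are not taken from the authors' training code; only the values come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters quoted in the model description."""
    base_model: str = "distilbert-base-uncased"
    epochs: int = 30
    batch_size: int = 16
    learning_rate: float = 5e-5
    max_seq_length: int = 512  # maximum number of tokens per input sequence


config = TrainingConfig()
print(config)
```

Freezing the dataclass keeps the recorded settings from being changed by accident when the object is passed around.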

Dataset: MBAD Data
Carbon emissions: 0.319355 kg

Train Accuracy (%)	Validation Accuracy (%)	Train Loss	Test Loss
76.97	62.00	0.45	0.96

Usage

The simplest way is to use the hosted Inference API on Hugging Face. The second way is to use the pipeline object provided by the transformers library.

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from transformers import pipeline

# Load the tokenizer and the TensorFlow model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("d4data/bias-detection-model")
model = TFAutoModelForSequenceClassification.from_pretrained("d4data/bias-detection-model")

# Build a text-classification pipeline; pass device=0 to run on the first GPU if one is available
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer)
# Classify a sample sentence and print the predicted label with its score
result = classifier("The irony, of course, is that the exhibit that invites people to throw trash at vacuuming Ivanka Trump lookalike reflects every stereotype feminists claim to stand against, oversexualizing Ivanka’s body and ignoring her hard work.")
print(result)
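A text-classification pipeline returns a list of dictionaries, each with a `label` and a `score`. The sketch below shows one possible way to turn that output into a single decision. The helper `top_prediction`, the `min_score` threshold, and the sample label and score are all hypothetical, chosen only for illustration; they are not actual output from this model.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def top_prediction(predictions: List[Prediction], min_score: float = 0.5) -> str:
    """Return the highest-scoring label, or "uncertain" if its score is below min_score."""
    best = max(predictions, key=lambda p: p["score"])
    if best["score"] < min_score:
        return "uncertain"
    return str(best["label"])


# Made-up data in the shape the pipeline returns; not real model output.
sample_output = [{"label": "Biased", "score": 0.97}]
print(top_prediction(sample_output))
```

In practice, `sample_output` would be replaced by the list returned from the `classifier(...)` call, and the threshold would be chosen against held-out data rather than fixed at 0.5.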

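Author

This model is part of the research topic "Bias and Fairness in AI" conducted by Deepak John Reji and Shaina Raza. If you use this work (code, model, or dataset), please star the following GitHub repository:

Bias & Fairness in AI, (2022), GitHub repository, https://github.com/dreji18/Fairness-in-AI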

Author:

D 4 Data Community

Dataset size:

255.76 MB