hakonmh/sentiment-xdistil-uncased

Model:

hakonmh/sentiment-xdistil-uncased

Task:

Text Classification

Libraries:

PyTorch Safetensors Transformers

Language:

Other:

bert finance financial-sentiment-analysis sentiment-analysis

Preprint:

arxiv:2303.15056

License:

mit

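Sentiment-xDistil is a model fine-tuned from xtremedistil-l12-h384-uncased to classify the sentiment of news headlines, using a dataset annotated by ChatGPT 3.5. It was built together with Topic-xDistil as a tool for filtering financial news headlines and classifying their sentiment. The code used to train both models and build the dataset can be found here.

Note: The output label is one of Negative, Neutral, or Positive. The model is intended for English text.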

Performance Results

Performance metrics for the two models on their respective test sets:

Model	Test Set Size	Accuracy	F1 Score
topic-xdistil-uncased	32 799	94.44 %	92.59 %
sentiment-xdistil-uncased	17 527	94.59 %	93.44 %
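
For readers who want to reproduce this kind of evaluation on their own labeled headlines, the sketch below computes accuracy and F1 for a three-class task in plain Python. The source does not say how the F1 score was averaged across classes, so the macro averaging used here is an assumption, and the example labels are made up for illustration.

```python
LABELS = ("Negative", "Neutral", "Positive")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up reference and predicted labels, for illustration only.
y_true = ["Positive", "Neutral", "Negative", "Positive"]
y_pred = ["Positive", "Neutral", "Positive", "Positive"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
```

On a real test set, `y_pred` would hold the labels returned by the model for each headline.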

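Data

The training data consists of 300k+ news headlines and tweets annotated by ChatGPT 3.5, which has been shown to outperform crowd-workers for text annotation tasks.

The ChatGPT prompt defined the sentence labels as follows: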

"""
[...]
Does the headline convey a Positive, Neutral, or Negative sentiment with \
regard to the current state or potential future impact on the economy or \
the asset described?
    - Positive sentiment headlines suggest growth, improvement, or \
stability in economic conditions.
    - Neutral sentiment headlines do not clearly indicate a positive or \
negative impact on the economy.
    - Negative sentiment headlines imply economic decline, uncertainty, \
or unfavorable conditions.
[...]
"""

Example Usage

Here is a simple example:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("hakonmh/sentiment-xdistil-uncased")
tokenizer = AutoTokenizer.from_pretrained("hakonmh/sentiment-xdistil-uncased")

SENTENCE = "Global Growth Surges as New Technologies Drive Innovation and Productivity!"
inputs = tokenizer(SENTENCE, return_tensors="pt")
output = model(**inputs).logits
predicted_label = model.config.id2label[output.argmax(-1).item()]

print(predicted_label)

Positive
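
The example above prints only the top label. When a confidence value per class is also useful, the logits can be passed through a softmax. The sketch below does this in plain Python so that it runs without downloading the model; the `ID2LABEL` mapping and the logit values are hypothetical stand-ins, and the real mapping should be read from `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(logits, id2label):
    """Map each class index to its label and softmax probability."""
    probs = softmax(logits)
    return {id2label[i]: p for i, p in enumerate(probs)}


# Hypothetical label order and logits, for illustration only.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}
example_logits = [-2.0, 0.5, 3.0]

probs = label_probabilities(example_logits, ID2LABEL)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With the real model, `output[0].detach().tolist()` from the example above yields the logits for the single input sentence, and `model.config.id2label` supplies the mapping.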

Alternatively, use it together with Topic-xDistil as a pipeline:

from transformers import pipeline

topic_classifier = pipeline("sentiment-analysis",
                            model="hakonmh/topic-xdistil-uncased",
                            tokenizer="hakonmh/topic-xdistil-uncased")
sentiment_classifier = pipeline("sentiment-analysis",
                                model="hakonmh/sentiment-xdistil-uncased",
                                tokenizer="hakonmh/sentiment-xdistil-uncased")

SENTENCE = "Global Growth Surges as New Technologies Drive Innovation and Productivity!"
print(topic_classifier(SENTENCE))
print(sentiment_classifier(SENTENCE))

[{'label': 'Economics', 'score': 0.9970171451568604}]
[{'label': 'Positive', 'score': 0.9997037053108215}]
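
Since the two models are meant to work as a filter followed by a sentiment classifier, the two calls can be chained so that sentiment is only computed for headlines the topic model accepts. The following is a minimal sketch under stated assumptions: `classify_financial`, `keep_topics`, and `min_topic_score` are hypothetical names, the 0.5 threshold is arbitrary, and the stub classifiers (including the "Other" label) only imitate the output shape shown above so that the example runs offline.

```python
def classify_financial(headlines, topic_fn, sentiment_fn,
                       keep_topics=("Economics",), min_topic_score=0.5):
    """Run sentiment classification only on headlines that pass the topic filter.

    topic_fn and sentiment_fn take one string and return a list with one
    {'label': ..., 'score': ...} dict, like the pipeline output shown above.
    """
    results = []
    for headline in headlines:
        topic = topic_fn(headline)[0]
        if topic["label"] not in keep_topics or topic["score"] < min_topic_score:
            continue  # skip headlines the topic model does not accept
        sentiment = sentiment_fn(headline)[0]
        results.append({
            "headline": headline,
            "topic": topic["label"],
            "sentiment": sentiment["label"],
            "score": sentiment["score"],
        })
    return results


# Stub classifiers standing in for the real pipelines (hypothetical behavior).
def fake_topic(text):
    label = "Economics" if "growth" in text.lower() else "Other"
    return [{"label": label, "score": 0.99}]


def fake_sentiment(text):
    return [{"label": "Positive", "score": 0.98}]


results = classify_financial(
    ["Global Growth Surges", "Local team wins cup"],
    fake_topic,
    fake_sentiment,
)
print(results)
```

To use the real models, pass `topic_classifier` and `sentiment_classifier` from the example above in place of the two stubs.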

Author:

Håkon Magne Holmen

Dataset size:

255.52 MB