Model fine-tuned from roberta-large for sentiment classification of financial news, with an emphasis on Canadian news.

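Introduction

This model was trained on the financial_news_sentiment_mixte_with_phrasebank_75 dataset. This is a customized version of the phrasebank dataset that keeps only the sentences validated by at least 75% of annotators. In addition, I manually added about 2,000 validated articles on Canadian financial news, so the model is trained more specifically for Canadian news. The final result is an overall f1 score of 93.25%, and 83.6% on Canadian news.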

Training data

The training data is classified as follows:

class	Description
0	negative
1	neutral
2	positive
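When working with raw class ids instead of the pipeline output, the table above can be written as a small lookup. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID` and `label_for` are illustrative and not part of the model's API, and the mapping is simply copied from the table.

```python
# Class ids and labels, copied from the training-data table above.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

# Reverse lookup, built from the same table.
LABEL2ID = {label: class_id for class_id, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the sentiment label for a class id (hypothetical helper)."""
    return ID2LABEL[class_id]


print(label_for(0))
```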

How to use roberta-large-financial-news-sentiment-en with HuggingFace

Load roberta-large-financial-news-sentiment-en and its sub-word tokenizer:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")
model = AutoModelForSequenceClassification.from_pretrained("Jean-Baptiste/roberta-large-financial-news-sentiment-en")


##### Process a text sample

from transformers import pipeline

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
pipe("Melcor REIT (TSX: MR.UN) today announced results for the third quarter ended September 30, 2022. Revenue was stable in the quarter and year-to-date. Net operating income was down 3% in the quarter at $11.61 million due to the timing of operating expenses and inflated costs including utilities like gas/heat and power")

[{'label': 'negative', 'score': 0.9399105906486511}]
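The pipeline call above reports only the top label and its score. If a probability for each of the three classes is needed, one option is to apply a softmax to the model's raw logits. The sketch below uses plain Python and made-up logits purely for illustration; in practice the logits would come from something like `model(**tokenizer(text, return_tensors="pt")).logits`. The helper names and the id-to-label order (taken from the training-data table) are assumptions.

```python
import math

# Assumed label order, taken from the training-data table.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores_by_label(logits):
    """Map each class label to its softmax probability (hypothetical helper)."""
    return {ID2LABEL[i]: p for i, p in enumerate(softmax(logits))}


# Made-up logits for illustration only; not real model output.
example_logits = [2.5, 0.3, -1.2]
scores = scores_by_label(example_logits)
best = max(scores, key=scores.get)
print(best, scores)
```

Keeping all three probabilities makes it possible to apply a custom threshold, for example treating low-confidence predictions as neutral.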

Model performance

Overall f1 score (macro average)

precision	recall	f1
0.9355	0.9299	0.9325

By class

class	precision	recall	f1
negative	0.9605	0.9240	0.9419
neutral	0.9538	0.9459	0.9498
positive	0.8922	0.9200	0.9059
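A macro average is the unweighted mean of the per-class values, so the overall row can be cross-checked against the per-class table. The sketch below does that in plain Python; `per_class` and `macro_average` are illustrative names, and the figures are copied from the table. Because the per-class values are already rounded to four decimals, the recomputed averages may differ from the reported ones in the last digit.

```python
# Per-class (precision, recall, f1), copied from the table above.
per_class = {
    "negative": (0.9605, 0.9240, 0.9419),
    "neutral": (0.9538, 0.9459, 0.9498),
    "positive": (0.8922, 0.9200, 0.9059),
}


def macro_average(rows):
    """Unweighted mean of each metric column across classes."""
    n = len(rows)
    columns = zip(*rows.values())  # transpose rows into metric columns
    return tuple(sum(column) / n for column in columns)


precision, recall, f1 = macro_average(per_class)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```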

Author:

JB Polle

Dataset size:

3.98 GB