Model:
mrm8488/t5-base-finetuned-sarcasm-twitter
Base model T5 fine-tuned on the Twitter Sarcasm Dataset for the downstream task of sequence classification (as text generation).
The T5 model was presented in Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li and Peter J. Liu. Here is the abstract:
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
Twitter training and test datasets are provided for the sarcasm detection task, in jsonlines format.
Each line contains a JSON object with the fields label, response, and context.
For example, for the following training example:
"label": "SARCASM", "response": "Did Kelly just call someone else messy? Baaaahaaahahahaha", "context": ["X is looking a First Lady should . #classact, "didn't think it was tailored enough it looked messy"]
The response tweet "Did Kelly..." is a reply to its immediate context "didn't think it was tailored...", which in turn is a reply to "X is looking...". Your goal is to predict the label of the "response" while also using the context (i.e., either the immediate or the full context).
Dataset size statistics:
| Train | Val | Test |
|-------|-----|------|
| 4050  | 450 | 500  |
The dataset was pre-processed to convert it to a text-to-text format (classification as a generation task), along the lines sketched below.
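The exact template is not documented here, but a minimal sketch of how a jsonlines example might be flattened into an input/target pair could look like this (the field layout comes from the example above; the separator and ordering are assumptions):

```python
import json

# One training example in the jsonlines format shown above.
line = ('{"label": "SARCASM", '
        '"response": "Did Kelly just call someone else messy? Baaaahaaahahahaha", '
        '"context": ["X is looking a First Lady should . #classact", '
        '"didn\'t think it was tailored enough it looked messy"]}')
example = json.loads(line)

# Assumed flattening: the conversation context followed by the response becomes
# the model input; the label is mapped to the token the model generates.
source = " ".join(example["context"] + [example["response"]])
target = "derison" if example["label"] == "SARCASM" else "normal"
```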
The training script is a slightly modified version of this Colab Notebook created by Suraj Patil, so all credit goes to him!
Test set metrics:

|              | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| derison      | 0.84      | 0.80   | 0.82     | 246     |
| normal       | 0.82      | 0.85   | 0.83     | 254     |
| accuracy     |           |        | 0.83     | 500     |
| macro avg    | 0.83      | 0.83   | 0.83     | 500     |
| weighted avg | 0.83      | 0.83   | 0.83     | 500     |
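This table follows the layout of scikit-learn's classification_report; metrics in this form can be reproduced from gold and predicted label lists as sketched below (the lists here are hypothetical placeholders, not the actual test data):

```python
from sklearn.metrics import classification_report

# Hypothetical gold and predicted labels; the real evaluation covered
# the 500 test examples with labels 'derison' and 'normal'.
y_true = ["derison", "normal", "normal", "derison"]
y_pred = ["derison", "normal", "derison", "derison"]

print(classification_report(y_true, y_pred, digits=2))
```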
Usage:

```python
from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-sarcasm-twitter")
model = AutoModelWithLMHead.from_pretrained("mrm8488/t5-base-finetuned-sarcasm-twitter")

def eval_conversation(text):
    input_ids = tokenizer.encode(text + '</s>', return_tensors='pt')
    output = model.generate(input_ids=input_ids, max_length=3)
    dec = [tokenizer.decode(ids) for ids in output]
    label = dec[0]
    return label

# For similarity with the training dataset, user mentions in tweets should be
# replaced with the @USER token and urls with the URL token.
twit1 = ("Trump just suspended the visa program that allowed me to move to the US to start @USER!"
         " Unfortunately, I won't be able to vote in a few months but if you can, please vote him out, "
         "he's destroying what made America great in so many different ways!")

twit2 = ("@USER @USER @USER We have far more cases than any other country, "
         "so leaving remote workers in would be disastrous. Makes Trump sense.")

twit3 = "My worry is that i wouldn't be surprised if half the country actually agrees with this move..."

me = "Trump doing so??? It must be a mistake... XDDD"

# We will get 'normal' when sarcasm is not detected and 'derison' when detected.
conversation = twit1 + twit2
eval_conversation(conversation)  # Output: 'derison'

conversation = twit1 + twit3
eval_conversation(conversation)  # Output: 'normal'

conversation = twit1 + me
eval_conversation(conversation)  # Output: 'derison'
```
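Since the model expects mentions and links normalized as in the training data, a small helper along these lines could be applied before calling eval_conversation (a sketch; the exact regexes used to build the dataset are not specified):

```python
import re

def normalize_tweet(text):
    # Replace user mentions with the @USER token and links with the URL
    # token, mirroring the normalization of the training data.
    text = re.sub(r"@\w+", "@USER", text)
    return re.sub(r"https?://\S+", "URL", text)

print(normalize_tweet("Check this out https://t.co/abc @someuser"))
# Check this out URL @USER
```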
Created by Manuel Romero/@mrm8488 | LinkedIn
Made with ♥ in Spain