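Dataset: dair-ai/emotion
Task: text classification
Language: en
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: machine-generated
Annotation creators: machine-generated
Source datasets: original
License: other

The Emotion dataset is a collection of English Twitter messages labeled with six basic emotions: anger, fear, joy, love, sadness, and surprise. For more detail, see the associated paper.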
An example record looks like this:
{ "text": "im feeling quite sad and sorry for myself but ill snap out of it soon", "label": 0 }
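The record above stores the emotion as an integer, so a small lookup is needed to read it. The following is a minimal sketch of that decoding step. The label order in `LABEL_NAMES` is an assumption, since this section does not state it; it is consistent with the example, where a sad message carries label 0, and it differs from the alphabetical order used in the description. `decode_record` is a hypothetical helper name, not part of any library.

```python
import json

# Assumed id-to-name order (not given in this section). It agrees with the
# example record, whose sad message has label 0.
LABEL_NAMES = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def decode_record(raw: str) -> dict:
    """Parse one JSON record and attach a human-readable label name."""
    record = json.loads(raw)
    record["label_name"] = LABEL_NAMES[record["label"]]
    return record


raw = (
    '{ "text": "im feeling quite sad and sorry for myself '
    'but ill snap out of it soon", "label": 0 }'
)
example = decode_record(raw)
print(example["label_name"])
```

If the assumed order turns out to be wrong for a given copy of the data, only `LABEL_NAMES` needs to change.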
The data fields are "text" and "label", as in the example above.
The dataset has two configurations:
name | train | validation | test |
---|---|---|---|
split | 16000 | 2000 | 2000 |
unsplit | 416809 | n/a | n/a |
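As a rough illustration of how the two configurations in the table might be used, the sketch below records the table's sizes, derives the train/validation/test proportions, and wraps a loader. It assumes the Hugging Face `datasets` package and that the configuration names `split` and `unsplit` are passed as the second argument to `load_dataset`. `CONFIG_SIZES`, `split_fractions`, and `load_emotion` are hypothetical names introduced here.

```python
# Sizes copied from the configuration table above.
CONFIG_SIZES = {
    "split": {"train": 16000, "validation": 2000, "test": 2000},
    "unsplit": {"train": 416809},
}


def split_fractions(config: str) -> dict:
    """Return each split's share of the configuration's total rows."""
    sizes = CONFIG_SIZES[config]
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_emotion(config: str = "split"):
    """Load one configuration; needs the `datasets` package and network access."""
    if config not in CONFIG_SIZES:
        raise ValueError(f"unknown configuration: {config!r}")
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("dair-ai/emotion", config)


print(split_fractions("split"))
```

For the `split` configuration the proportions work out to 80% train, 10% validation, and 10% test. The `unsplit` configuration has a single train split, so any held-out set would have to be carved out by the user.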
The dataset should be used for educational and research purposes only.
If you use this dataset, please cite:
@inproceedings{saravia-etal-2018-carer,
    title = "{CARER}: Contextualized Affect Representations for Emotion Recognition",
    author = "Saravia, Elvis and Liu, Hsien-Chi Toby and Huang, Yen-Hao and Wu, Junlin and Chen, Yi-Shin",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D18-1404",
    doi = "10.18653/v1/D18-1404",
    pages = "3687--3697",
    abstract = "Emotions are expressed in nuanced ways, which varies by collective or individual experiences, knowledge, and beliefs. Therefore, to understand emotion, as conveyed through text, a robust mechanism capable of capturing and modeling different linguistic nuances and phenomena is needed. We propose a semi-supervised, graph-based algorithm to produce rich structural descriptors which serve as the building blocks for constructing contextualized affect representations from text. The pattern-based representations are further enriched with word embeddings and evaluated through several emotion recognition tasks. Our experimental results demonstrate that the proposed method outperforms state-of-the-art techniques on emotion recognition tasks.",
}