Dataset: daily_dialog
Tasks: Text Classification
Languages: en
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotation creators: expert-generated
Source datasets: original
License: cc-by-nc-sa-4.0

We develop a high-quality multi-turn dialog dataset, DailyDialog, which is intriguing in several aspects. The language is human-written and less noisy. The dialogues in the dataset reflect the way we communicate in daily life and cover various everyday topics. We also manually label the dataset with communication intention and emotion information. We then evaluate existing approaches on the DailyDialog dataset and hope it benefits research on dialog systems.
An example of 'validation' looks as follows.
This example was too long and was cropped: { "act": [2, 1, 1, 1, 1, 2, 3, 2, 3, 4], "dialog": "[\"Good afternoon . This is Michelle Li speaking , calling on behalf of IBA . Is Mr Meng available at all ? \", \" This is Mr Meng ...", "emotion": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] }
The data fields are the same among all splits.
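Since each record carries one `act` label and one `emotion` label per turn, a common first step is to line the three fields up turn by turn. The snippet below is a minimal sketch of that alignment. It assumes `dialog` has already been decoded into a list of utterance strings; the helper name `align_turns` and the sample record are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def align_turns(record):
    """Pair each utterance with its act and emotion label.

    Assumes record["dialog"] is a list of strings with one entry per turn.
    """
    dialog, acts, emotions = record["dialog"], record["act"], record["emotion"]
    if not (len(dialog) == len(acts) == len(emotions)):
        raise ValueError("dialog, act, and emotion must have the same length")
    return list(zip(dialog, acts, emotions))


# Illustrative record, written for this example (not a row from DailyDialog).
record = {
    "dialog": [
        "Hello , is Mr Lee there ?",
        "Speaking .",
        "Could you call back later ?",
    ],
    "act": [2, 1, 3],
    "emotion": [0, 0, 0],
}

turns = align_turns(record)

# Count how often each act label occurs in this dialog.
act_counts = Counter(act for _, act, _ in turns)
```

The length check is deliberate: a record whose label lists do not match the number of turns would otherwise be silently truncated by `zip`.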
name | train | validation | test |
---|---|---|---|
default | 11118 | 1000 | 1000 |
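To put the split sizes in proportion, the short calculation below derives each split's share of the total from the counts in the table. The variable names are illustrative only.

```python
# Split sizes copied from the table above.
splits = {"train": 11118, "validation": 1000, "test": 1000}

total = sum(splits.values())

# Fraction of all dialogs that falls into each split.
shares = {name: count / total for name, count in splits.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The training split therefore holds roughly 85% of the 13,118 dialogs, with the remainder divided equally between validation and test.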
Dataset provided for research purposes only. Please check dataset license for additional information.
The DailyDialog dataset is licensed under CC BY-NC-SA 4.0.
@InProceedings{li2017dailydialog,
  author = {Li, Yanran and Su, Hui and Shen, Xiaoyu and Li, Wenjie and Cao, Ziqiang and Niu, Shuzi},
  title = {DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset},
  booktitle = {Proceedings of The 8th International Joint Conference on Natural Language Processing (IJCNLP 2017)},
  year = {2017}
}