


Dataset Card for SODA Synthetic Dialogue


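SODA Synthetic Dialogue is a set of synthetic dialogues between an Assistant and a User. In each conversation, the User asks the Assistant to perform a summarization or story generation task, given a snippet of an existing dialogue or story, a title, or a theme.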

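This dataset was created by taking the dialogues in Soda and applying a set of templates to turn each one into a User/Assistant conversation. Soda itself is described in its original research paper.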
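The templating step can be pictured with a small sketch. This is not the original synthesis script: the record fields (`speakers`, `utterances`, `title`, `theme`), the `TEMPLATES` table, and the `build_conversation` helper are all hypothetical names chosen for illustration, and the template wording is simply copied from the example shown in this card.

```python
from typing import Dict, List

# Hypothetical templates, modeled on the phrasing of this card's example.
TEMPLATES: Dict[str, str] = {
    "continue": (
        "User: Can you write the next few lines of dialogue for this scene:\n"
        "{start}\n"
        "Assistant: Sure, the next dialogue for this scene could be:\n"
        "{rest}"
    ),
    "title": (
        "User: And a title?\n"
        "Assistant: Sure, a title for this dialogue could be:\n"
        "  {title}"
    ),
    "theme": (
        "User: What would be one theme of this story?\n"
        "Assistant: One theme of this story could be:\n"
        "  {theme}"
    ),
}


def indent_lines(speakers: List[str], utterances: List[str]) -> str:
    """Render 'Speaker: utterance' lines, indented by two spaces."""
    return "\n".join(f"  {s}: {u}" for s, u in zip(speakers, utterances))


def build_conversation(record: dict, split_at: int = 2) -> str:
    """Turn one source record (hypothetical schema) into a conversation string.

    The first `split_at` utterances become the User's prompt; the remainder
    becomes the Assistant's continuation, followed by title and theme turns.
    """
    start = indent_lines(record["speakers"][:split_at], record["utterances"][:split_at])
    rest = indent_lines(record["speakers"][split_at:], record["utterances"][split_at:])
    parts = [
        TEMPLATES["continue"].format(start=start, rest=rest),
        TEMPLATES["title"].format(title=record["title"]),
        TEMPLATES["theme"].format(theme=record["theme"]),
    ]
    return "\n".join(parts)
```

Calling `build_conversation` on a record with three utterances and `split_at=2` puts the first two utterances in the User prompt and the third in the Assistant reply. The real script may use different templates, fields, and sampling.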


User: Can you write the next few lines of dialogue for this scene:
  Cornell: I know what you're thinking, Dontrell. I can sense your feelings and know exactly what you're going to say or do next. You're thinking about how much you miss your family, and how you wish you could see them again. You're wondering if there's anything you can do to make that happen. And you're also feeling a little guilty, because you know that if you could see them again, it would only be for a short while before they died.
  Dontrell: Cornell, how can you possibly know all that? It's like you can read my mind!
Assistant: Sure, the next dialogue for this scene could be:
  Cornell: I told you, I can sense your feelings. It's like I have a connection to your thoughts and emotions. I'm not sure why or how it works, but ever since we met, I've been able to understand what you're thinking and feeling pretty easily.
  Dontrell: Wow...that's amazing! And a little bit scary too. But it does explain why we've always been so close - even though we've only known each other for a few months.
  Cornell: Yeah, I guess our connection is pretty strong. But don't worry, Dontrell - I'll never use it to manipulate or control you in any way. I promise."
  Dontrell: That's a relief! I was starting to feel a little bit like you could read my mind, and it was kind of creepy. But if you promise not to use your abilities to control me, then I'm okay with it.
  Cornell: I swear, Dontrell. I'll never use my abilities to control or manipulate you in any way. You have my word.
User: And a title?
Assistant: Sure, a title for this dialogue could be:
  Cornell is intuitive. Cornell reads Dontrell's mind.
User: How about a short description?
Assistant: Sure, a short description for this dialogue could be:
  Cornell knows what Dontrell is thinking. He can sense her feelings and know exactly what she is going to say or do next.
User: What would be one theme of this story?
Assistant: One theme of this story could be:
  being intuitive


The dataset contains one configuration, dialogue_modeling, which has a single text feature, conversation.
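Because conversation is one flat string, downstream code usually has to split it back into turns. The sketch below does this under one assumption drawn from the example in this card: turn markers (`User:` and `Assistant:`) start at the beginning of a line, while quoted dialogue lines are indented. The `split_turns` helper is a hypothetical name and is not part of the dataset or of any library.

```python
import re
from typing import List, Tuple

# A turn starts with an unindented "User:" or "Assistant:" marker (assumption
# based on the example conversation in this card).
TURN_RE = re.compile(r"^(User|Assistant): ?(.*)$")


def split_turns(conversation: str) -> List[Tuple[str, str]]:
    """Split a conversation string into (role, text) pairs."""
    turns: List[List[str]] = []
    for line in conversation.splitlines():
        match = TURN_RE.match(line)
        if match:
            # New turn: record the role and the rest of the marker line.
            turns.append([match.group(1), match.group(2)])
        elif turns:
            # Continuation line (e.g. indented dialogue): append to current turn.
            turns[-1][1] += "\n" + line
    return [(role, text.strip()) for role, text in turns]
```

Each returned pair holds the role and the full text of that turn, including any indented lines that follow the marker. Lines that appear before the first marker are ignored.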


The script to synthesize this dataset was originally created by ontocord in a Colab notebook and was prepared for Hugging Face by Jeffrey Quesnelle.



  author    = {ontocord and Jeffrey Quesnelle},
  title     = {SODA Synthetic Dialogue},
  year      = {2023}