Dataset:

pszemraj/dolly_hhrlhf-text2text

English

dolly_hhrlhf-text2text

This is mosaicml/dolly_hhrlhf with the following changes:

  • cleaned up and adapted the prompt column for the text2text-generation task (no special template needed)
  • split the original train set into a 95% train set and an explicit 5% validation set
  • fixed extra spaces before punctuation (as this is not a French dataset)
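The card does not say how the 95%/5% split was produced. As a minimal sketch only, a seeded shuffle followed by a hold-out slice would give a split of that shape. The function name, the seed, and the `prompt`/`response` row layout below are assumptions for illustration, not the author's actual procedure.

```python
import random


def split_train_validation(rows, validation_fraction=0.05, seed=42):
    """Shuffle a copy of `rows` and hold out a fraction as a validation set.

    Hypothetical helper: the seed and rounding rule are illustrative choices.
    """
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Toy rows standing in for the real data (column names are assumed).
rows = [{"prompt": f"question {i}", "response": f"answer {i}"} for i in range(100)]
train, validation = split_train_validation(rows)
print(len(train), len(validation))  # 95 5
```

Fixing the seed keeps the validation set stable across runs, which matters when comparing models trained at different times.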

Details on the extra spaces:

Original sentence 1: How can I be healthy ?
Fixed sentence 1: How can I be healthy?
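A fix of this kind can be sketched with a single regular expression. This is an assumption about how such a cleanup could be done, not the script that produced this dataset. The function name and the set of punctuation marks are illustrative.

```python
import re

# Assumption: only spaces or tabs directly before these marks count as errors.
# Caveat: this would also join cases like "costs .5", so real data may need
# extra rules.
_SPACE_BEFORE_PUNCT = re.compile(r"[ \t]+([?!.,;:])")


def fix_punctuation_spacing(text: str) -> str:
    """Remove spaces or tabs that directly precede common punctuation marks."""
    return _SPACE_BEFORE_PUNCT.sub(r"\1", text)


print(fix_punctuation_spacing("How can I be healthy ?"))  # How can I be healthy?
```

Text that is already correctly spaced passes through unchanged, so the function is safe to apply to every row.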