Dataset:

HuggingFaceH4/instruct_me

Dataset card for Instruct Me

Dataset summary

Instruct Me is a dataset of prompts and instruction dialogues between a human user and an AI assistant. The prompts are derived from (prompt, completion) pairs in the Helpful Instructions dataset. The goal is to train a language model that is "chatty" and can answer the kinds of questions or perform the kinds of tasks a human user might give an AI assistant.

Supported Tasks and Leaderboard

We provide three configs that can be used to train models with RLHF:

instruction_tuning

Single-turn user/bot dialogues for instruction tuning.

reward_modeling

Prompts for generating model completions and collecting human preference data.

ppo

Prompts to generate model completions for optimization of the instruction-tuned model with techniques like PPO.
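
As a minimal sketch of how one of the three configs above might be selected, the snippet below wraps load_dataset from the Hugging Face datasets library. The helper name load_instruct_me, the CONFIGS tuple, and the default "train" split are assumptions made for illustration and are not stated in this card; the repository must also be reachable from the Hub for the call to succeed.

```python
# Config names as listed in this card.
CONFIGS = ("instruction_tuning", "reward_modeling", "ppo")


def load_instruct_me(config: str, split: str = "train"):
    """Load one config of HuggingFaceH4/instruct_me (hypothetical helper).

    Assumption: a split named "train" exists; the card does not list splits.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so the name check works without the library installed.
    from datasets import load_dataset

    # The second positional argument of load_dataset is the config name.
    return load_dataset("HuggingFaceH4/instruct_me", config, split=split)


# Example (requires network access and the `datasets` package):
#   ds = load_instruct_me("ppo")
#   print(ds.column_names)
```

Validating the config name before touching the network gives a clearer error than a failed download would, which is the only design choice this sketch makes.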

Changelog

  • March 6, 2023: v1.1.0 release. Renamed the text column of the reward_modeling and ppo configs to prompt, for consistency with our dataset schemas elsewhere.
  • March 5, 2023: v1.0.0 release.