Dataset:

Thaweewat/instruction-wild-52k-th

Summary

This is a Thai instruction dataset translated from InstructionWild using Google Cloud Translation. The source InstructionWild collection contains 52,191 English and 51,504 Chinese instructions collected from Twitter, where users tend to share interesting prompts, mostly of the generation, open QA, and brainstorming types. Colossal AI also used these instructions to train the ColossalChat model.

Supported Tasks:

  • Training LLMs
  • Synthetic Data Generation
  • Data Augmentation

Languages: Thai

Version: 1.0
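
For readers who want to try the data, the following is a minimal sketch of loading it with the Hugging Face `datasets` library. The helper name `load_instructions` and the `"train"` split are assumptions made for illustration; this card does not state which splits or columns the dataset exposes, so inspect the returned object before relying on any field names.

```python
# Dataset identifier as given at the top of this card.
DATASET_ID = "Thaweewat/instruction-wild-52k-th"


def load_instructions(split="train"):
    """Load the dataset from the Hugging Face Hub.

    Hypothetical helper: the "train" split name is an assumption,
    not something this card confirms.
    """
    # Imported lazily so the module can be imported without `datasets`
    # installed; calling this function requires it and network access.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)
```

Calling `load_instructions()` returns a dataset object; printing it lists the actual column names and row count, which is the safest way to confirm the schema.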