数据集:

lucasmccabe-lmi/oig_small_chip2_python

中文

Dataset Card for "oig_small_chip2_python"

Dataset Summary

From LAION's Open Instruction Generalist (OIG) dataset , we use a 4775-prompt segment pertaining to Python code generation. OIG text elements are formatted as dialogue exerpts between a "human" and "bot" agent. The code generation prompt is parsed from the initial "human" agent's statement and the resultant response from the "bot" agent's statement. We then reformat the text/response pairs according to the format of the original Alpaca dataset; that is, instruction/input/output triplets. In cases where the instruction field does not specify the code language, we provide "Write the code in Python" in the input field. Otherwise, the input field is left blank.

The OIG dataset was prepared by LAION, and released under the Apache 2.0 license.

Numbers: