Alpaca GPT4 English-to-Italian Translated Instructions (work in progress)

This dataset contains 15,209 instructions translated from English to Italian using gpt-3.5-turbo.

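Alpaca GPT4: the original alpaca_gpt4_data.json dataset contains 52K instruction-following examples generated by GPT-4 from the Alpaca prompts. The JSON file has the same format as the Alpaca data, except that the outputs are generated by GPT-4: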

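instruction: str, describes the task the model should perform. Each of the 52K instructions is unique.
input: str, optional context or input for the task.
output: str, the answer to the instruction, as generated by GPT-4.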
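
As a minimal sketch of the record layout described above, the following Python checks that each record carries the three string fields. The helper name validate_records and the sample record are made up for illustration and are not taken from the dataset.

```python
import json

# The three fields each record is expected to carry, per the description above.
REQUIRED_FIELDS = ("instruction", "input", "output")


def validate_records(records):
    """Return only the records whose three fields are all present and strings."""
    valid = []
    for rec in records:
        if all(isinstance(rec.get(key), str) for key in REQUIRED_FIELDS):
            valid.append(rec)
    return valid


# Hypothetical record, written by hand for this example (not real dataset content).
sample = json.loads(
    '[{"instruction": "Name a primary color.", "input": "", "output": "Red."}]'
)
print(len(validate_records(sample)))
```

To apply the same check to the original file, the records could be loaded with json.load on alpaca_gpt4_data.json, assuming the file is a JSON list of such objects.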

The instructions are taken from the Alpaca GPT4 dataset and were translated using the following prompt:

Act as an unrivaled English-to-Italian Translator. Your task is to translate the given passage into Italian, as you are a native Italian speaker. Each message passage contains an instruction, with an optional input (preceded by [|IN|]) and an output ([|OUT|]). You MUST provide accurate and fluent Italian translations. When translating the instruction use the second person singular. Translate each section. Keep [|IN|] and [|OUT|] placeholders. If the input or output doesn't make sense in Italian, revise them.

ENGLISH:
"""
{passage}
"""

ITALIAN:

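Here, 'passage' stands for the original English text. The inputs are structured so that gpt-3.5-turbo can understand the context. They are translated into Italian while preserving the context of the original English instruction, and are formatted as follows: <instruction> (optional [|IN|] <input>) [|OUT|] <output>.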
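
The passage format above can be sketched in Python as a pair of helpers: one that assembles a passage from a record, and one that splits a translated passage back into its fields. The function names build_passage and split_passage are hypothetical, and the single-space joining is an assumption; the card does not say how the author actually built or parsed the passages.

```python
IN_TAG = "[|IN|]"
OUT_TAG = "[|OUT|]"


def build_passage(instruction, output, input_text=""):
    """Join a record as: <instruction> ([|IN|] <input>) [|OUT|] <output>."""
    parts = [instruction.strip()]
    if input_text.strip():
        # The input section is optional and only emitted when non-empty.
        parts.append(f"{IN_TAG} {input_text.strip()}")
    parts.append(f"{OUT_TAG} {output.strip()}")
    return " ".join(parts)


def split_passage(passage):
    """Split a passage back into instruction, input, and output fields."""
    head, sep, output = passage.partition(OUT_TAG)
    if not sep:
        raise ValueError("missing [|OUT|] placeholder")
    # If [|IN|] is absent, partition leaves input_text as an empty string.
    instruction, _, input_text = head.partition(IN_TAG)
    return {
        "instruction": instruction.strip(),
        "input": input_text.strip(),
        "output": output.strip(),
    }
```

For example, build_passage("Translate the word.", "gatto", "cat") yields "Translate the word. [|IN|] cat [|OUT|] gatto", and split_passage recovers the three fields from that string. Raising an error when [|OUT|] is missing is one way to catch translations where the model dropped a placeholder.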

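License

Please note that the original Alpaca GPT4 dataset and the translations generated by gpt-3.5-turbo may each carry their own licenses, and it is important to comply with any usage restrictions specified by the original data sources. Because this dataset contains partially translated data, proper attribution and compliance with the relevant licenses are advised. This dataset is intended for research use only. It is licensed under CC BY NC 4.0 (non-commercial use only), and models trained on it should not be used outside of research purposes.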

Citation

@article{peng2023instruction,
  title={Instruction Tuning with GPT-4},
  author={Peng, Baolin and Li, Chunyuan and He, Pengcheng and Galley, Michel and Gao, Jianfeng},
  journal={arXiv preprint arXiv:2304.03277},
  year={2023}
}

Author:

efederici

Dataset size:

12.98 MB