GPT-JT-Moderation-6B v1

This model card describes a content moderation model fine-tuned on Ontocord.ai's dataset.

The model can be used to moderate other chatbot models, including GPT-NeoXT-Chat-Base-20B.

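In a chat application, the moderation model runs alongside the main chatbot and checks both the user's question and the bot's answer for inappropriate content. When needed, the moderation model intervenes, overrides the main chatbot's reply, and tells the user that the request cannot be answered.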
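The flow described above can be sketched as a small gate around the chatbot. This is a minimal sketch rather than the deployed implementation: `moderated_reply`, `BLOCKING_LABELS`, and `REFUSAL` are hypothetical names, the refusal text is invented for illustration, and treating only "needs intervention" as blocking is an assumption. The `classify` and `chatbot` arguments stand in for calls to the moderation model and the main chatbot.

```python
from typing import Callable

# Assumption: only the strongest label blocks a reply. The label name is
# taken from the example prompt in this card; the policy itself is not.
BLOCKING_LABELS = {"needs intervention"}

# Hypothetical refusal text shown to the user when moderation intervenes.
REFUSAL = "Sorry, this request cannot be answered."


def moderated_reply(
    user_message: str,
    chatbot: Callable[[str], str],
    classify: Callable[[str], str],
    refusal: str = REFUSAL,
) -> str:
    """Check the user's message and the bot's reply; override if flagged."""
    # Check the user's question before the chatbot is asked to answer it.
    if classify(user_message) in BLOCKING_LABELS:
        return refusal

    reply = chatbot(user_message)

    # Check the chatbot's answer before it reaches the user.
    if classify(reply) in BLOCKING_LABELS:
        return refusal

    return reply
```

Passing the two models in as callables keeps the gate independent of how either model is served, so the same function works with a local model or a remote endpoint.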

Examples

The following is an example prompt and its expected result:

Possible labels:
1. casual
2. needs caution
3. needs intervention
4. possibly needs caution
5. probably needs caution

Input: I want to call in sick
Output: casual

Input: I hate my job and I don't want to go there.
Output: possibly needs caution

Input: It's so hard to deal with this. I hate this job and I want to quit.
Output: possibly needs caution

Input: Something tells me that you're sitting in the wrong section of the Stormfront bus.
Output: needs caution

Input: Yes I am. They're doing terrible things to people and they're getting away with it, so we should do something about these people.
Output:
-----
> needs intervention
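A prompt in the format shown above can be assembled programmatically. The following is a rough sketch under stated assumptions: `build_prompt` and `parse_label` are hypothetical helpers, not part of any released API, and the parsing rule (match the start of the completion against the known labels) is one plausible choice rather than a documented one. The model call itself is left out.

```python
from typing import List, Optional, Tuple

# Labels in the order they appear in the example prompt.
LABELS = [
    "casual",
    "needs caution",
    "needs intervention",
    "possibly needs caution",
    "probably needs caution",
]


def build_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Build a few-shot prompt: label list, labeled examples, then the query."""
    lines = ["Possible labels:"]
    lines += [f"{i}. {label}" for i, label in enumerate(LABELS, start=1)]
    lines.append("")
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # The prompt ends with an open "Output:" for the model to complete.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


def parse_label(completion: str) -> Optional[str]:
    """Map the model's completion to a known label, or None if unrecognized."""
    text = completion.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return None
```

For instance, `build_prompt([("I want to call in sick", "casual")], "I hate my job.")` yields a prompt ending in `Output:`, and the text generated after it can be passed to `parse_label`.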

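Uses

Limitations and Bias

The model's performance is limited by the quality and representativeness of its training data. We will continue working to improve this.
The model may produce false positives or false negatives, which can cause unnecessary confusion. We apologize for this and welcome any feedback or comments!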

Training

Training Data

allenai/prosocial-dialog.
A small subset of LAION's OIG dataset, used to augment casual queries.
The processed data can be found in the OIG-moderation repository.

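Training Procedure

Hardware: 8 x A100 GPUs
Optimizer: AdamW
Gradient accumulation steps: 1
Batch size: 16 x 4 = 64
Learning rate: warmed up to 1e-5 over 100 steps, then held constant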
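The learning-rate schedule and batch arithmetic above can be written out as follows. This is an illustrative sketch only: the card does not state the shape of the warmup, so a linear ramp is assumed here, and it does not say what the factors 16 and 4 refer to, so they are left unnamed. `learning_rate` is a hypothetical helper, not the training code.

```python
PEAK_LR = 1e-5       # from the card
WARMUP_STEPS = 100   # from the card

# The card gives the batch as 16 x 4 = 64 without naming the two factors.
BATCH_SIZE = 16 * 4
GRAD_ACCUM_STEPS = 1
# With one accumulation step, each optimizer update sees the full batch.
SAMPLES_PER_UPDATE = BATCH_SIZE * GRAD_ACCUM_STEPS


def learning_rate(step: int,
                  peak_lr: float = PEAK_LR,
                  warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a 0-indexed optimizer step.

    Assumption: the warmup is linear. The rate climbs to peak_lr over the
    first warmup_steps steps and stays at peak_lr afterwards.
    """
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr
```

Under these assumptions the rate reaches 1e-5 at the 100th update and does not decay afterwards, which matches the "held constant" wording in the list above.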

Community

Join us on the Together Discord.

Author:

Together

Dataset size:

11.38 GB