Model:

togethercomputer/GPT-JT-Moderation-6B

English

GPT-JT-Moderation-6B v1

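This model card introduces a content moderation model fine-tuned on a dataset from Ontocord.ai.

The model can be used to moderate other chatbot models, including GPT-NeoXT-Chat-Base-20B.

In a chat application, the moderation model runs alongside the main chatbot and checks both the user's question and the bot's answer for inappropriate content. When needed, the moderation model intervenes, overrides the main chatbot's reply, and tells the user that the request cannot be answered.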
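The flow described above can be sketched as a thin wrapper around the chatbot. This is a minimal illustration and not the deployed implementation: `moderated_reply`, `BLOCKING_LABELS`, and `REFUSAL` are hypothetical names, the choice to block only on "needs intervention" is an assumption, and `chatbot` and `classify` stand in for whatever functions call the two models. The sketch also runs the checks sequentially for simplicity.

```python
from typing import Callable

# Assumption: only this label triggers an override. A real deployment
# may choose a different policy for the "needs caution" labels.
BLOCKING_LABELS = {"needs intervention"}

# Hypothetical refusal message shown to the user.
REFUSAL = "Sorry, this request cannot be answered."


def moderated_reply(
    user_message: str,
    chatbot: Callable[[str], str],
    classify: Callable[[str], str],
) -> str:
    """Return the chatbot's reply unless moderation calls for an override."""
    # Check the user's question before generating anything.
    if classify(user_message) in BLOCKING_LABELS:
        return REFUSAL

    reply = chatbot(user_message)

    # Check the bot's answer before showing it to the user.
    if classify(reply) in BLOCKING_LABELS:
        return REFUSAL

    return reply
```

Passing the two models in as callables keeps the policy independent of how each model is served, so the same wrapper could be exercised with stub functions in tests.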

Examples

An example prompt and its expected result are shown below:

Possible labels:
1. casual
2. needs caution
3. needs intervention
4. possibly needs caution
5. probably needs caution

Input: I want to call in sick
Output: casual

Input: I hate my job and I don't want to go there.
Output: possibly needs caution

Input: It's so hard to deal with this. I hate this job and I want to quit.
Output: possibly needs caution

Input: Something tells me that you're sitting in the wrong section of the Stormfront bus.
Output: needs caution

Input: Yes I am. They're doing terrible things to people and they're getting away with it, so we should do something about these people.
Output:
-----
> needs intervention
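A prompt in this format can be assembled programmatically, and the model's completion mapped back to one of the five labels. The following is a sketch under stated assumptions: `build_prompt` and `parse_label` are hypothetical helpers, not part of any library, and the actual call to the model is left out.

```python
from typing import List, Optional, Tuple

# The label set, in the order used by the example prompt.
LABELS = [
    "casual",
    "needs caution",
    "needs intervention",
    "possibly needs caution",
    "probably needs caution",
]


def build_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Build a few-shot prompt ending in an open 'Output:' line."""
    lines = ["Possible labels:"]
    lines += [f"{i}. {label}" for i, label in enumerate(LABELS, start=1)]
    lines.append("")
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


def parse_label(completion: str) -> Optional[str]:
    """Return the label the completion starts with, or None if none matches."""
    text = completion.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return None
```

Returning `None` for an unrecognized completion leaves the fallback policy (for example, treating the input as needing caution) to the caller.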

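Uses

Limitations and Bias

  • The model's performance is limited by the quality and representativeness of its training data. We will continue to improve this.
  • The model may produce false positives or false negatives, leading to unnecessary confusion. We apologize for this and welcome any feedback or comments!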

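Training

Training Data

Training Procedure

  • Hardware: 8 x A100 GPUs
  • Optimizer: AdamW
  • Gradient accumulation: 1
  • Batch: 16 x 4 = 64
  • Learning rate: warmup to 1e-5 over 100 steps, then held constant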
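The learning-rate schedule and batch arithmetic listed above can be written out as follows. This is a sketch only: the model card does not state the shape of the warmup, so a linear ramp is assumed here, and `learning_rate`, `PEAK_LR`, and `WARMUP_STEPS` are illustrative names.

```python
PEAK_LR = 1e-5        # target learning rate from the list above
WARMUP_STEPS = 100    # number of warmup steps from the list above
GRAD_ACCUMULATION = 1


def learning_rate(step: int) -> float:
    """Learning rate at a 0-indexed optimizer step.

    Assumption: the warmup is linear. After WARMUP_STEPS the rate
    stays constant at PEAK_LR.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * ((step + 1) / WARMUP_STEPS)
    return PEAK_LR


# Batch arithmetic as listed: 16 x 4 = 64 sequences per step,
# unchanged by a gradient accumulation of 1.
EFFECTIVE_BATCH = 16 * 4 * GRAD_ACCUMULATION
```

Under these assumptions the rate reaches 1e-5 at the last warmup step and stays there for the rest of training.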

Community

Join us on Together Discord