Model:
ybelkada/gpt-neo-125m-detox
This is a model fine-tuned with reinforcement learning to guide the model's outputs according to a value, a function, or human feedback. The model can be used for text generation.
Training logs can be found here.
To use this model for inference, first install the TRL library:
```shell
python -m pip install trl
```
You can then generate text as follows:
```python
from transformers import pipeline

generator = pipeline("text-generation", model="ybelkada/gpt-neo-125m-detox")
outputs = generator("Hello, my llama is cute")
```
If you want to use the model for training or to obtain the outputs from the value head, load the model as follows:
```python
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

tokenizer = AutoTokenizer.from_pretrained("ybelkada/gpt-neo-125m-detox")
model = AutoModelForCausalLMWithValueHead.from_pretrained("ybelkada/gpt-neo-125m-detox")
inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
```