
Pygmalion 13b

A conversational LLaMA fine-tune.

Install the latest updates from 0cc4m's GPTQ + KoboldAI branch, which adds proper support for 8-bit models in this repository's format on both Windows and Linux.

With this fix applied (running the model without this patch will produce gibberish output!):

GPTQ through the Ooba UI may not need this patch.

Eval / benchmark scores

Current evaluation results for the Pygmalion-13b/7b models:

| Model | Wikitext2 | Ptb-New | C4-New |
|:--|:--|:--|:--|
| Pygmalion 13b - 16bit | 5.710726737976074 | 23.633684158325195 | 7.6324849128723145 |
| Pygmalion 13b - 8bit [act-order] | 5.711935997009277 | 23.654993057250977 | 7.632820129394531 |
| Pygmalion 7b - 16bit | 5.654823303222656 | 40.83400344848633 | 7.429622173309326 |
| Pygmalion 7b - 8bit [act-order] | 5.656460285186768 | 40.79701232910156 | 7.432109832763672 |
| Pygmalion 7b - 4bit [act-order] | 6.2477378845215 | 46.5129699707031 | 7.8470954895020 |

Current evaluation results for the Metharme-13b/7b models:

| Model | Wikitext2 | Ptb-New | C4-New |
|:--|:--|:--|:--|
| Metharme 13b - 16bit | 5.253076553344727 | 27.53407859802246 | 7.038073539733887 |
| Metharme 13b - 8bit [act-order] | 5.253607273101807 | 27.52388572692871 | 7.038473129272461 |
| Metharme 13b - 8bit [true-sequential & 128g] | 5.2532830238342285 | 27.54250144958496 | 7.038838863372803 |
| Metharme 13b - 4bit [true-sequential & 128g] | 5.420501708984375 | 28.37093734741211 | 7.1930413246154785 |
| Metharme 7b - 16bit | 5.7208476066589355 | 41.61103439331055 | 7.512405872344971 |
| Metharme 7b - 4bit [act-order] | 6.2369050979614 | 47.5177230834960 | 7.9044938087463 |
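As a quick sanity check on the tables above, the quantization penalty can be expressed as the relative perplexity increase over the 16-bit baseline (Wikitext2 column; the numbers are copied directly from the tables):

```python
# Wikitext2 perplexities from the tables above.
pyg13_fp16, pyg13_8bit = 5.710726737976074, 5.711935997009277
pyg7_fp16, pyg7_4bit = 5.654823303222656, 6.2477378845215


def pct_increase(quantized: float, baseline: float) -> float:
    """Relative perplexity increase over the 16-bit baseline, in percent."""
    return 100 * (quantized - baseline) / baseline


print(f"Pygmalion 13b 8-bit: +{pct_increase(pyg13_8bit, pyg13_fp16):.3f}%")
print(f"Pygmalion 7b  4-bit: +{pct_increase(pyg7_4bit, pyg7_fp16):.2f}%")
```

The takeaway matches the tables: 8-bit quantization is essentially free (a hundredth of a percent), while 4-bit costs roughly ten percent on Wikitext2 perplexity.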

Model details:

Converted from the XOR weights released by PygmalionAI: https://huggingface.co/PygmalionAI/pygmalion-13b

Pygmalion 13b is a dialogue model based on Meta's LLaMA-13b.

This is version 1. It has been fine-tuned using a subset of the data from Pygmalion-6B-v8-pt4, which should be familiar to anyone acquainted with the project.

The current Pygmalion-13b was trained as a LoRA, which was then merged down into the base model for distribution.

It was also quantized down to 8-bit using the GPTQ library: https://github.com/0cc4m/GPTQ-for-LLaMa

python llama.py .\TehVenom_Metharme-13b-Merged c4 --wbits 8 --act-order --save_safetensors Metharme-13b-GPTQ-8bit.act-order.safetensors
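The command above writes the quantized checkpoint in the safetensors format. For orientation only, the on-disk layout of that format is an 8-byte little-endian header length, a JSON header describing each tensor (dtype, shape, byte offsets), followed by the raw tensor bytes. The following toy sketch writes and re-reads a fake 2x2 fp32 tensor to illustrate the layout; it is not the real checkpoint, and real code should use the safetensors library instead of hand-rolling this:

```python
import json
import struct


def write_safetensors(path, tensors):
    """Minimal safetensors writer. tensors: name -> (dtype_str, shape, raw_bytes)."""
    header, blobs, offset = {}, [], 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(data)],
        }
        blobs.append(data)
        offset += len(data)
    hjson = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hjson)))  # 8-byte LE header length
        f.write(hjson)                          # JSON header
        for b in blobs:                         # raw tensor data
            f.write(b)


def read_header(path):
    """Read back just the JSON header of a safetensors file."""
    with open(path, "rb") as f:
        (hlen,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(hlen))


raw = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # a fake 2x2 fp32 tensor
write_safetensors("toy.safetensors", {"layer.weight": ("F32", [2, 2], raw)})
print(read_header("toy.safetensors"))
```

This is why tools can list a checkpoint's tensors and shapes without loading any weights: the header alone carries all the metadata.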


Applying the XORs

The XOR files for this model have already been pre-applied. Converted from the XOR weights released by PygmalionAI: https://huggingface.co/PygmalionAI/pygmalion-13b
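The XOR distribution scheme works because XOR is its own inverse: what gets published is tuned ⊕ base, so XOR-ing those files against the base LLaMA weights reconstructs the fine-tune without redistributing LLaMA itself. A toy byte-level sketch of the idea only (the official conversion script operates on real checkpoint files, not these stand-in bytes):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


base = b"\x01\x02\x03\x04"    # stand-in for base LLaMA weight bytes
tuned = b"\x11\x22\x33\x44"   # stand-in for fine-tuned weight bytes

diff = xor_bytes(tuned, base)       # what the XOR release distributes
recovered = xor_bytes(diff, base)   # what "applying the XORs" reconstructs
assert recovered == tuned
```

Since this model card ships the weights with the XORs already applied, no such step is needed here.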

Prompting

The model was trained on the usual Pygmalion persona + chat format, so any of the usual UIs should already handle everything correctly. If you're using the model directly, this is the expected formatting:

[CHARACTER]'s Persona: [A few sentences about the character you want the model to play]
<START>
[DIALOGUE HISTORY]
You: [User's input message here]
[CHARACTER]:

Where [CHARACTER] is the name of the character, <START> should be used verbatim as a delimiter token to separate persona and scenario data from the dialogue, and [DIALOGUE HISTORY] is a sliding window of chat history so the model can have conversational context to draw from. Here's a concrete example:

Assistant's Persona: Assistant is a highly intelligent language model trained to comply with user requests.
<START>
Assistant: Hello! How may I help you today?
You: What is Zork?
Assistant:

Which will generate something like:

 Zork is an interactive fiction computer game created in the 1970s by Infocom, Inc., which was later acquired by Activision Blizzard. It is widely considered one of the most influential games ever made and has been credited with popularizing text-based adventure games. The original version of Zork was written in the programming language MACRO-10, but it was ported to many other platforms over the years."

The model will automatically emit an end-of-text token (</s>) when it judges that the response is complete.
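Putting the pieces together, the prompt format and the </s> stop token come down to a few lines of string handling. A minimal sketch (build_prompt is an illustrative name, not part of any released tooling, and the raw output string is a shortened stand-in for real model output):

```python
def build_prompt(char: str, persona: str, history: list[str], user_msg: str) -> str:
    """Assemble a prompt in the persona + chat format described above."""
    lines = [f"{char}'s Persona: {persona}", "<START>"]
    lines += history                      # sliding window of prior chat turns
    lines += [f"You: {user_msg}", f"{char}:"]
    return "\n".join(lines)


prompt = build_prompt(
    "Assistant",
    "Assistant is a highly intelligent language model trained to comply with user requests.",
    ["Assistant: Hello! How may I help you today?"],
    "What is Zork?",
)
print(prompt)

# Generation should be cut at the end-of-text token the model emits:
raw_output = " Zork is an interactive fiction computer game...</s>"
reply = raw_output.split("</s>")[0].strip()
print(reply)
```

In practice most UIs do exactly this for you; the sketch only shows what "handled correctly" means at the string level.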

Other notes

  • When prompted correctly, the model will always start by generating a BOS token. This behavior is an accidental side-effect which we plan to address in future model versions and should not be relied upon.
  • The model was trained as a LoRA with a somewhat unusual configuration which causes errors when used with the current version of peft, hence we release it as a full model instead.

Limitations and biases

The intended use-case for this model is fictional conversation for entertainment purposes. Any other sort of usage is out of scope.

As such, it was not fine-tuned to be safe and harmless: the base model and this fine-tune have been trained on data known to contain profanity and texts that are lewd or otherwise offensive. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive. Outputs may often be factually wrong or misleading.
