Model:

TheBloke/MPT-7B-Instruct-GGML

Libraries:

Transformers

Datasets:

mosaicml/dolly_hhrlhf

Other:

mpt Composer MosaicML llm-foundry

Preprints:

arxiv:2205.14135 arxiv:2108.12409 arxiv:2010.04245

License:

cc-by-sa-3.0

Language:

English

Chat & support: my new Discord server

Want to contribute? TheBloke's Patreon page

MPT-7B-Instruct GGML

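These are 4-bit, 5-bit and 8-bit GGML-format quantized models of MosaicML's MPT-7B-Instruct.

This repository is the result of converting the model to GGML and quantizing it.

Please note that these MPT GGML files are not compatible with llama.cpp. See the list below for tools known to work with these model files.

Repositories available

Provided files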

Name	Quant method	Bits	Size	RAM required	Use case
mpt7b-instruct.ggmlv3.q4_0.bin	q4_0	4bit	4.16GB	6.2GB	4-bit.
mpt7b-instruct.ggmlv3.q4_1.bin	q4_1	4bit	4.99GB	7.2GB	4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models.
mpt7b-instruct.ggmlv3.q5_0.bin	q5_0	5bit	4.57GB	6.8GB	5-bit. Higher accuracy, higher resource usage and slower inference.
mpt7b-instruct.ggmlv3.q5_1.bin	q5_1	5bit	4.99GB	7.2GB	5-bit. Even higher accuracy, and higher resource usage and slower inference.
mpt7b-instruct.ggmlv3.q8_0.bin	q8_0	8bit	7.48GB	9.7GB	8-bit. Almost indistinguishable from float16. Huge resource use and slow. Not recommended for normal use.
mpt7b-instruct.ggmlv3.fp16.bin	fp16	16bit	13.30GB	16GB	Full 16-bit.
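
As a rough illustration of how the table might be used, the sketch below picks the highest-accuracy file whose "RAM required" figure fits a given memory budget. The file names and RAM figures are copied from the table. The helper name `pick_file` and the quality ordering (inferred from the "Use case" column) are assumptions made for this example, and treating the RAM column as a hard threshold is a simplification.

```python
# (file name, RAM required in GB), ordered from highest to lowest accuracy.
# The ordering is an assumption inferred from the "Use case" descriptions.
FILES_BY_QUALITY = [
    ("mpt7b-instruct.ggmlv3.fp16.bin", 16.0),
    ("mpt7b-instruct.ggmlv3.q8_0.bin", 9.7),
    ("mpt7b-instruct.ggmlv3.q5_1.bin", 7.2),
    ("mpt7b-instruct.ggmlv3.q5_0.bin", 6.8),
    ("mpt7b-instruct.ggmlv3.q4_1.bin", 7.2),
    ("mpt7b-instruct.ggmlv3.q4_0.bin", 6.2),
]


def pick_file(available_ram_gb):
    """Return the most accurate file that fits in the RAM budget, or None."""
    for name, ram_required_gb in FILES_BY_QUALITY:
        if ram_required_gb <= available_ram_gb:
            return name
    return None


print(pick_file(7.0))
print(pick_file(6.0))
```

With a 7.0 GB budget this selects the q5_0 file, and with 6.0 GB nothing in the table fits.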

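Compatibility

These files are not compatible with llama.cpp.

Currently they can be used with:

KoboldCpp, a powerful inference engine based on llama.cpp, with a good UI
The ctransformers Python library, which includes LangChain support
GPT4All-UI, which uses ctransformers
rustformers' llm
The example mpt binary provided with ggml

As other options become available I will endeavour to update them here (do let me know in the Community tab if I've missed something!)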
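
For readers who want to try one of the tools listed above, here is a minimal sketch of loading a file from this repository with the ctransformers Python library. It assumes ctransformers is installed (`pip install ctransformers`) and that the chosen file is downloaded from the Hugging Face Hub on first use. The helper names `load_kwargs` and `load_model` are hypothetical and exist only for this example.

```python
REPO_ID = "TheBloke/MPT-7B-Instruct-GGML"


def load_kwargs(quant="q4_0"):
    """Build the keyword arguments that select one quantized file from the repo."""
    return {
        "model_file": f"mpt7b-instruct.ggmlv3.{quant}.bin",
        "model_type": "mpt",  # tells ctransformers which architecture to use
    }


def load_model(quant="q4_0"):
    """Download (if needed) and load the model. This fetches several GB."""
    # Imported lazily so the helper above can be used without ctransformers.
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(REPO_ID, **load_kwargs(quant))


# Example usage (commented out because it triggers a large download):
# llm = load_model("q4_0")
# print(llm("Write a haiku about quantization.", max_new_tokens=64))
```

The lazy import keeps the file-selection logic usable on machines where ctransformers is not installed.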

Tutorial for using GPT4All-UI

Discord

For further support, and discussions on these models and AI in general, join us at:

TheBloke AI's Discord server

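Thanks, and how to contribute

Thanks to the chirper.ai team!

I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine-tuning and training.

If you're able and willing to contribute, it will be most gratefully received and will help me to keep providing more models and to start work on new AI projects.

Donors will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, and other benefits.

Patreon: https://patreon.com/TheBlokeAI
Ko-Fi: https://ko-fi.com/TheBlokeAI

Patreon special mentions: Aemon Algiz, Dmitriy Samsonov, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, Jonathan Leane, Talal Aujan, V. Lukas, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Sebastain Graf, Johann-Peter Hartman.

Thank you to all my generous patrons and donors!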

MPT-7B-Instruct

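MPT-7B-Instruct is a model for short-form instruction following. It was built by fine-tuning MPT-7B on a dataset derived from the Databricks Dolly-15k and the Anthropic Helpful and Harmless (HH-RLHF) datasets.

License: CC-By-SA-3.0
Demo on Hugging Face Spaces

This model was fine-tuned by the MosaicML NLP team and follows a modified decoder-only transformer architecture.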

Model date

May 5, 2023

Model license

CC-By-SA-3.0

Documentation

Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs
Codebase (mosaicml/llm-foundry repo)
Questions: Feel free to contact us via the MosaicML Community Slack!

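Example question/instruction

Longboi24:

What is a quoll?

MPT-7B-Instruct:

A quoll (pronounced "cool") is one of Australia's native carnivorous marsupials, also known as macropods or wallabies in other regions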

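How to use

Note: This model requires that trust_remote_code=True be passed to the from_pretrained method. This is because it uses a custom MPT model architecture that is not yet part of the Hugging Face transformers package.

MPT includes options for many training efficiency features such as FlashAttention (Dao et al. 2022), ALiBi, QK LayerNorm, and more.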

import transformers
model = transformers.AutoModelForCausalLM.from_pretrained(
  'mosaicml/mpt-7b-instruct',
  trust_remote_code=True
)

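To use the optimized triton implementation of FlashAttention, you can load the model with attn_impl='triton' in bfloat16 precision and move it to the GPU:

import torch
import transformers

# Load the model config and switch the attention implementation to triton
config = transformers.AutoConfig.from_pretrained(
  'mosaicml/mpt-7b-instruct',
  trust_remote_code=True
)
config.attn_config['attn_impl'] = 'triton'

# Load the weights in bfloat16 with the modified config
model = transformers.AutoModelForCausalLM.from_pretrained(
  'mosaicml/mpt-7b-instruct',
  config=config,
  torch_dtype=torch.bfloat16,
  trust_remote_code=True
)
model.to(device='cuda:0')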

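Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during fine-tuning and/or inference. For example: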

config = transformers.AutoConfig.from_pretrained(
  'mosaicml/mpt-7b-instruct',
  trust_remote_code=True
)
config.update({"max_seq_len": 4096})
model = transformers.AutoModelForCausalLM.from_pretrained(
  'mosaicml/mpt-7b-instruct',
  config=config,
  trust_remote_code=True
)
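
To give some intuition for why ALiBi permits a longer sequence length than the one used in training, the sketch below computes ALiBi-style attention biases in plain Python. It follows the slope formula from the ALiBi paper for a power-of-two head count and is not taken from the MPT code base. The function names `alibi_slopes` and `alibi_bias` are hypothetical.

```python
def alibi_slopes(n_heads):
    """Per-head slopes 2^(-8*h/n_heads) for h = 1..n_heads (power-of-two heads)."""
    if n_heads <= 0 or n_heads & (n_heads - 1) != 0:
        raise ValueError("this sketch only handles a power-of-two head count")
    return [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]


def alibi_bias(slope, seq_len):
    """Causal bias matrix for one head.

    Entry [i][j] is -slope * (i - j) for keys at or before the query position,
    and -inf for future positions, which are masked out.
    """
    neg_inf = float("-inf")
    return [
        [-slope * (i - j) if j <= i else neg_inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)
short = alibi_bias(slopes[0], 3)
longer = alibi_bias(slopes[0], 4)

# The bias for a longer sequence extends the shorter one without changing it.
print([row[:3] for row in longer[:3]] == short)
```

Because the bias depends only on the distance between query and key, there is no learned position table to resize: a longer sequence simply adds rows and columns computed by the same rule.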

This model uses the EleutherAI/gpt-neox-20b tokenizer.

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

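Model description

The architecture is a modification of a standard decoder-only transformer.

The model has been modified from a standard transformer in the following ways:

It uses FlashAttention
It uses ALiBi (Attention with Linear Biases) and does not use positional embeddings
It does not use biases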

Hyperparameter	Value
n_parameters	6.7B
n_layers	32
n_heads	32
d_model	4096
vocab size	50432
sequence length	2048
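
As a sanity check on the table above, the parameter count can be approximated from the other hyperparameters. The sketch below assumes a 4x MLP expansion ratio and a tied input/output embedding, and it ignores LayerNorm weights. None of these assumptions is stated in this card. The absence of bias terms is taken from the list of modifications. The function name `estimate_params` is hypothetical.

```python
def estimate_params(n_layers, d_model, vocab_size, mlp_ratio=4):
    """Approximate parameter count of a bias-free decoder-only transformer."""
    attention = 4 * d_model * d_model        # Q, K, V and output projections
    mlp = 2 * mlp_ratio * d_model * d_model  # up and down projections
    embedding = vocab_size * d_model         # shared by input and output layers
    return n_layers * (attention + mlp) + embedding


total = estimate_params(n_layers=32, d_model=4096, vocab_size=50432)
print(f"{total:,} parameters")
print(f"{total * 2 / 1e9:.2f} GB at 2 bytes per parameter")
```

Under these assumptions the estimate comes to roughly 6.65 billion parameters, in line with the 6.7B listed, and about 13.30 GB at 16-bit precision, which agrees with the size of the fp16 file listed under "Provided files".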

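Pretraining data

For more details on the pretraining process, see MPT-7B.

The data was tokenized using the EleutherAI/gpt-neox-20b tokenizer.

Limitations and biases

The following language is modified from EleutherAI's GPT-NeoX-20B.

MPT-7B-Instruct can produce factually incorrect output and should not be relied on to produce factually accurate information. MPT-7B-Instruct was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

Acknowledgements

This model was fine-tuned by Sam Havens and the MosaicML NLP team.

MosaicML Platform

If you're interested in training and deploying your own MPT or LLMs on the MosaicML Platform, sign up here.

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

Citation

Please cite this model using the following format: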

@online{MosaicML2023Introducing,
    author    = {MosaicML NLP Team},
    title     = {Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs},
    year      = {2023},
    url       = {www.mosaicml.com/blog/mpt-7b},
    note      = {Accessed: 2023-03-28}, % change this date
    urldate   = {2023-03-28} % change this date
}

Author:

Tom Jobbins

Dataset size:

35.23 GB