StableLM-Tuned-Alpha

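Model Description

StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.

Usage

Get started chatting with StableLM-Tuned-Alpha by using the following code snippet: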

import torch  # needed for the type annotations in StopOnTokens
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList

tokenizer = AutoTokenizer.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
model = AutoModelForCausalLM.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
model.half().cuda()  # cast the weights to FP16 and move the model to the GPU

class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Stop generating as soon as the most recent token is one of these IDs.
        stop_ids = [50278, 50279, 50277, 1, 0]
        for stop_id in stop_ids:
            if input_ids[0][-1] == stop_id:
                return True
        return False

system_prompt = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

prompt = f"{system_prompt}<|USER|>What's your mood today?<|ASSISTANT|>"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
tokens = model.generate(
  **inputs,
  max_new_tokens=64,
  temperature=0.7,
  do_sample=True,
  stopping_criteria=StoppingCriteriaList([StopOnTokens()])
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))

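StableLM Tuned should be used with prompts formatted as <|SYSTEM|>...<|USER|>...<|ASSISTANT|>... The system prompt is:

<|SYSTEM|># StableLM Tuned (Alpha version)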
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
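
The prompt layout used in the usage snippet can also be assembled programmatically. The following is a minimal sketch, not part of the released code: `build_prompt` is a hypothetical helper, and chaining earlier turns as repeated `<|USER|>...<|ASSISTANT|>...` pairs is an assumption, since the snippet only demonstrates a single turn.

```python
from typing import List, Tuple

# Tag strings taken from the usage snippet.
SYSTEM_TAG = "<|SYSTEM|>"
USER_TAG = "<|USER|>"
ASSISTANT_TAG = "<|ASSISTANT|>"


def build_prompt(system_text: str,
                 history: List[Tuple[str, str]],
                 user_message: str) -> str:
    """Hypothetical helper: join a system prompt, past turns, and a new user message.

    `system_text` is the system prompt WITHOUT the leading <|SYSTEM|> tag.
    `history` holds (user, assistant) pairs from earlier turns; the multi-turn
    layout is an assumption, not something the model card specifies.
    """
    parts = [f"{SYSTEM_TAG}{system_text}"]
    for user_turn, assistant_turn in history:
        parts.append(f"{USER_TAG}{user_turn}{ASSISTANT_TAG}{assistant_turn}")
    # End with an open assistant tag so the model continues from there.
    parts.append(f"{USER_TAG}{user_message}{ASSISTANT_TAG}")
    return "".join(parts)


if __name__ == "__main__":
    demo = build_prompt("# StableLM Tuned (Alpha version)\n", [], "What's your mood today?")
    print(demo)
```

With an empty history this reproduces the single-turn prompt from the usage snippet; the resulting string can be passed to the tokenizer in the same way.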

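Model Details

Developed by: Stability AI
Model type: StableLM-Tuned-Alpha models are auto-regressive language models based on the NeoX transformer architecture.
Language(s): English
Library: HuggingFace Transformers
License: Fine-tuned checkpoints (StableLM-Tuned-Alpha) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca.
Contact: For questions and comments about the model, please email lm@stability.ai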

Training

Parameters	Hidden Size	Layers	Heads	Sequence Length
3B	4096	16	32	4096
7B	6144	16	48	4096
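
As a rough sanity check on the table above, the per-head dimension and an approximate parameter count can be derived from the listed sizes. This is a sketch under stated assumptions: the `12 * layers * hidden_size^2` estimate assumes a standard transformer block (attention projections plus an MLP with 4x expansion) and ignores embeddings, biases, and layer norms, none of which the table specifies. `ArchConfig` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchConfig:
    """Hypothetical container for one row of the architecture table."""
    hidden_size: int
    layers: int
    heads: int

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the hidden dimension.
        return self.hidden_size // self.heads

    def approx_block_params(self) -> int:
        # Assumed estimate: 4*d^2 for attention (Q, K, V, output projections)
        # plus 8*d^2 for an MLP with 4x expansion, per layer.
        return 12 * self.layers * self.hidden_size ** 2


# Values copied from the table above.
CONFIGS = {
    "3B": ArchConfig(hidden_size=4096, layers=16, heads=32),
    "7B": ArchConfig(hidden_size=6144, layers=16, heads=48),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        billions = cfg.approx_block_params() / 1e9
        print(f"{name}: head_dim={cfg.head_dim}, approx block params={billions:.2f}B")
```

Both configurations use a head dimension of 128, and the block-only estimates come out at roughly 3.2B and 7.2B parameters, before embeddings are counted.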

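Training Dataset

StableLM-Tuned-Alpha models are fine-tuned on a combination of five datasets: Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine; GPT4All Prompt Generations, which consists of 400k prompts and responses generated by GPT-3.5-Turbo; Anthropic HH, made up of preferences about AI assistant helpfulness and harmlessness; DataBricks Dolly, comprising 15k instruction/response pairs written by Databricks employees across the capability domains described in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization; and ShareGPT Vicuna (English subset), a dataset of conversations retrieved from ShareGPT.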

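Training Procedure

Models are trained via supervised fine-tuning on the aforementioned datasets, in mixed precision (FP16), and optimized with AdamW. We outline the following hyperparameters: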

Parameters	Batch Size	Learning Rate	Warm-up	Weight Decay	Betas
3B	256	2e-5	50	0.01	(0.9, 0.99)
7B	128	2e-5	100	0.01	(0.9, 0.99)
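
To make the warm-up column concrete, the sketch below records the hyperparameters from the table and computes a learning rate with linear warm-up. Two points are assumptions rather than facts from this model card: that the warm-up values are counted in optimizer steps, and that the rate is held constant after warm-up (no decay schedule is stated here). `FinetuneConfig` and `lr_at_step` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FinetuneConfig:
    """Hypothetical container for one row of the hyperparameter table."""
    batch_size: int
    learning_rate: float
    warmup_steps: int  # assumption: the warm-up column is measured in steps
    weight_decay: float
    betas: Tuple[float, float]


# Values copied from the table above.
CONFIGS = {
    "3B": FinetuneConfig(256, 2e-5, 50, 0.01, (0.9, 0.99)),
    "7B": FinetuneConfig(128, 2e-5, 100, 0.01, (0.9, 0.99)),
}


def lr_at_step(cfg: FinetuneConfig, step: int) -> float:
    """Linear warm-up from 0 to the peak rate, then constant (assumed schedule)."""
    if cfg.warmup_steps <= 0:
        return cfg.learning_rate
    return cfg.learning_rate * min(1.0, step / cfg.warmup_steps)


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        halfway = cfg.warmup_steps // 2
        print(f"{name}: lr at step {halfway} = {lr_at_step(cfg, halfway):.2e}")
```

In a PyTorch training loop, the same values would typically be passed as the `lr`, `betas`, and `weight_decay` arguments of `torch.optim.AdamW`, with the warm-up handled by a learning-rate scheduler.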

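Use and Limitations

Intended Use

These models are intended to be used by the open-source community in chat-like applications, in adherence with the CC BY-NC-SA-4.0 license.

Limitations and Bias

Although the aforementioned datasets help to steer the base language models into "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use responsibly.

Acknowledgements

This work would not have been possible without the help of Dakota Mahan (@dmayhem93).

Citations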

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

@misc{vicuna2023,
  title = {Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality},
  url = {https://vicuna.lmsys.org},
  author = {Chiang, Wei-Lin and Li, Zhuohan and Lin, Zi and Sheng, Ying and Wu, Zhanghao and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao and Gonzalez, Joseph E. and Stoica, Ion and Xing, Eric P.},
  month = {March},
  year = {2023}
}

@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}

Author:

Stability AI

Dataset size:

29.57 GB