Model: stabilityai/stablelm-tuned-alpha-7b
Task: Text generation
Datasets: dmayhem93/ChatCombined, tatsu-lab/alpaca, nomic-ai/gpt4all_prompt_generations, Dahoas/full-hh-rlhf, jeffwan/sharegpt_vicuna, HuggingFaceH4/databricks_dolly_15k
Language: en
License: cc-by-nc-sa-4.0

StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.
Get started chatting with StableLM-Tuned-Alpha using the following code snippet:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList

    tokenizer = AutoTokenizer.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
    model = AutoModelForCausalLM.from_pretrained("StabilityAI/stablelm-tuned-alpha-7b")
    model.half().cuda()  # cast to FP16 and move to the GPU

    class StopOnTokens(StoppingCriteria):
        """Stop generation as soon as the last generated token is a stop token."""

        def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
            stop_ids = [50278, 50279, 50277, 1, 0]
            for stop_id in stop_ids:
                if input_ids[0][-1] == stop_id:
                    return True
            return False

    system_prompt = """<|SYSTEM|># StableLM Tuned (Alpha version)
    - StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
    - StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
    - StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
    - StableLM will refuse to participate in anything that could harm a human.
    """

    prompt = f"{system_prompt}<|USER|>What's your mood today?<|ASSISTANT|>"

    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    tokens = model.generate(
        **inputs,
        max_new_tokens=64,
        temperature=0.7,
        do_sample=True,
        stopping_criteria=StoppingCriteriaList([StopOnTokens()])
    )
    # The decoded text contains the prompt followed by the model's reply.
    print(tokenizer.decode(tokens[0], skip_special_tokens=True))

StableLM Tuned should be used with prompts formatted as <|SYSTEM|>...<|USER|>...<|ASSISTANT|>... The system prompt is:

    <|SYSTEM|># StableLM Tuned (Alpha version)
    - StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
    - StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
    - StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
    - StableLM will refuse to participate in anything that could harm a human.

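The prompt format above can be assembled programmatically. The following is a minimal sketch, not part of the official model card: the names `SYSTEM_PROMPT`, `build_prompt`, and `turns` are hypothetical, and concatenating earlier `<|USER|>`/`<|ASSISTANT|>` turns for multi-turn chat is an assumption extrapolated from the single-turn format shown here.

```python
# Hypothetical helper for building StableLM-Tuned-Alpha prompts.
# The system prompt text is copied from the model card; everything else
# (function name, multi-turn concatenation) is an illustrative assumption.

SYSTEM_PROMPT = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""


def build_prompt(turns, system_prompt=SYSTEM_PROMPT):
    """Build a prompt string from a list of (user_message, assistant_reply) pairs.

    Pass None as the assistant reply of the final pair so the prompt ends
    with an open <|ASSISTANT|> tag for the model to complete.
    """
    parts = [system_prompt]
    for user_message, assistant_reply in turns:
        parts.append(f"<|USER|>{user_message}<|ASSISTANT|>")
        if assistant_reply is not None:
            parts.append(assistant_reply)
    return "".join(parts)


if __name__ == "__main__":
    # Single-turn prompt, matching the example in the usage snippet.
    print(build_prompt([("What's your mood today?", None)]))
```

Passing the returned string to the tokenizer reproduces the single-turn `prompt` from the usage snippet; whether the model handles longer concatenated histories well is not stated in the source.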
Parameters | Hidden Size | Layers | Heads | Sequence Length |
---|---|---|---|---|
3B | 4096 | 16 | 32 | 4096 |
7B | 6144 | 16 | 48 | 4096 |
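As a rough sanity check on the table above, the per-head dimension and an approximate parameter count can be derived from the listed sizes. This is a back-of-the-envelope sketch only: the `12 * layers * hidden_size**2` estimate assumes a standard transformer block with a 4x MLP expansion (an assumption, not stated in the source) and ignores embeddings, biases, and layer norms. The names `CONFIGS`, `head_dim`, and `approx_block_params` are hypothetical.

```python
# Architecture values copied from the table; helper names are illustrative.
CONFIGS = {
    "3B": {"hidden_size": 4096, "layers": 16, "heads": 32, "seq_len": 4096},
    "7B": {"hidden_size": 6144, "layers": 16, "heads": 48, "seq_len": 4096},
}


def head_dim(cfg):
    """Per-head dimension: the hidden size is split evenly across heads."""
    if cfg["hidden_size"] % cfg["heads"] != 0:
        raise ValueError("hidden_size must be divisible by heads")
    return cfg["hidden_size"] // cfg["heads"]


def approx_block_params(cfg):
    """Approximate parameters in the transformer blocks only.

    Assumes 4*d^2 weights for attention (Q, K, V, output projections) plus
    8*d^2 for an MLP with 4x expansion, i.e. 12*d^2 per layer.
    """
    d = cfg["hidden_size"]
    return 12 * cfg["layers"] * d * d


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        billions = approx_block_params(cfg) / 1e9
        print(f"{name}: head_dim={head_dim(cfg)}, ~{billions:.2f}B block parameters")
```

Under these assumptions both models use a head dimension of 128, and the block-only estimates land close to the nominal 3B and 7B sizes.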
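StableLM-Tuned-Alpha models are fine-tuned on a combination of five datasets: Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine; GPT4All Prompt Generations, which consists of 400k prompts and responses generated by GPT-3.5-Turbo; Anthropic HH, made up of preferences about AI assistant helpfulness and harmlessness; DataBricks Dolly, comprising 15k instruction/response pairs written by Databricks employees across the capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization; and ShareGPT Vicuna (English subset), a dataset of conversations retrieved from ShareGPT.
Models are trained via supervised fine-tuning on the aforementioned datasets, in mixed precision (FP16), and optimized with AdamW. We outline the following hyperparameters: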
Parameters | Batch Size | Learning Rate | Warm-up | Weight Decay | Betas |
---|---|---|---|---|---|
3B | 256 | 2e-5 | 50 | 0.01 | (0.9, 0.99) |
7B | 128 | 2e-5 | 100 | 0.01 | (0.9, 0.99) |
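The hyperparameters above can be captured in a small configuration with a warm-up schedule. This is a sketch under stated assumptions, not the authors' training code: the source does not say whether warm-up is measured in optimizer steps, whether it is linear, or what schedule follows it. Here warm-up is assumed to be linear over optimizer steps and the learning rate is held constant afterward. The names `HPARAMS` and `lr_at_step` are hypothetical.

```python
# Hyperparameter values copied from the table; the schedule is an assumption.
HPARAMS = {
    "3B": {"batch_size": 256, "lr": 2e-5, "warmup_steps": 50,
           "weight_decay": 0.01, "betas": (0.9, 0.99)},
    "7B": {"batch_size": 128, "lr": 2e-5, "warmup_steps": 100,
           "weight_decay": 0.01, "betas": (0.9, 0.99)},
}


def lr_at_step(step, base_lr, warmup_steps):
    """Learning rate at a zero-based optimizer step.

    ASSUMPTION: linear warm-up to base_lr over `warmup_steps` steps, then a
    constant rate. The model card does not specify the post-warm-up schedule.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


if __name__ == "__main__":
    cfg = HPARAMS["7B"]
    for step in (0, 49, 99, 1000):
        print(step, lr_at_step(step, cfg["lr"], cfg["warmup_steps"]))
```

In PyTorch, the `lr`, `betas`, and `weight_decay` values would map onto the arguments of the same names in `torch.optim.AdamW`.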
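These models are intended to be used by the open-source community in chat-like applications, in adherence with the CC BY-NC-SA-4.0 license.
Although the aforementioned datasets help to steer the base language models toward "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use responsibly.
This work would not have been possible without the help of Dakota Mahan (@dmayhem93).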
@misc{alpaca, author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto }, title = {Stanford Alpaca: An Instruction-following LLaMA model}, year = {2023}, publisher = {GitHub}, journal = {GitHub repository}, howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}}, }
@misc{vicuna2023, title = {Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality}, url = {https://vicuna.lmsys.org}, author = {Chiang, Wei-Lin and Li, Zhuohan and Lin, Zi and Sheng, Ying and Wu, Zhanghao and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao and Gonzalez, Joseph E. and Stoica, Ion and Xing, Eric P.}, month = {March}, year = {2023} }
@misc{gpt4all, author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar}, title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo}, year = {2023}, publisher = {GitHub}, journal = {GitHub repository}, howpublished = {\url{https://github.com/nomic-ai/gpt4all}}, }