Model:
TheBloke/orca_mini_7B-GPTQ
Chat & support: my new Discord server
Want to contribute? TheBloke's Patreon page
These files are GPTQ 4-bit model files for Pankaj Mathur's Orca Mini 7B.
It is the result of quantising to 4-bit using GPTQ-for-LLaMa.
Prompt template:

```
### System:
You are an AI assistant that follows instruction extremely well. Help as much as you can.

### User:
prompt

### Response:
```

or

```
### System:
You are an AI assistant that follows instruction extremely well. Help as much as you can.

### User:
prompt

### Input:
input

### Response:
```
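As a minimal sketch, the two templates above can be produced with a small formatting helper. The function name `make_prompt` is ours, not from the original repo; the same logic appears in the full `generate_text` example later in this card.

```python
# Minimal sketch: build an orca_mini prompt from the templates above.
# The helper name make_prompt is hypothetical.
def make_prompt(system, instruction, input=None):
    if input:
        return (f"### System:\n{system}\n\n### User:\n{instruction}\n\n"
                f"### Input:\n{input}\n\n### Response:\n")
    return f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"
```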
Please make sure you're using the latest version of text-generation-webui.
First make sure you have AutoGPTQ installed:
```
pip install auto-gptq
```
Then try the following example code:
```python
from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name_or_path = "TheBloke/orca_mini_7B-GPTQ"
model_basename = "orca-mini-7b-GPTQ-4bit-128g.no-act.order"

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename=model_basename,
        use_safetensors=True,
        trust_remote_code=False,
        device="cuda:0",
        use_triton=use_triton,
        quantize_config=None)

# Build the prompt using this model's template (see "Prompt template" above).
prompt = "Tell me about AI"
prompt_template = f'''### System:
You are an AI assistant that follows instruction extremely well. Help as much as you can.

### User:
{prompt}

### Response:
'''

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))

# Inference can also be done using transformers' pipeline.
# Prevent printing spurious transformers error when using pipeline with AutoGPTQ:
logging.set_verbosity(logging.CRITICAL)

print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.15
)

print(pipe(prompt_template)[0]['generated_text'])
```
orca-mini-7b-GPTQ-4bit-128g.no-act.order.safetensors
This will work with AutoGPTQ, ExLlama, and the CUDA versions of GPTQ-for-LLaMa. There are reports of issues with GPTQ-for-LLaMa's Triton mode. If you have problems, please use AutoGPTQ instead.
It was created with group_size 128 to increase inference accuracy, but without --act-order (desc_act) to increase compatibility and improve inference speed.
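For reference, here is a sketch of how these settings map onto AutoGPTQ's `BaseQuantizeConfig`. The authoritative values live in the repo's quantize_config.json; the code below simply restates what is described above.

```python
from auto_gptq import BaseQuantizeConfig

# Sketch of the quantisation settings described above; the authoritative
# values are in the repo's quantize_config.json.
quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit quantisation
    group_size=128,  # group_size 128, for inference accuracy
    desc_act=False,  # no --act-order / desc_act, for compatibility and speed
)
```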
For further support, and discussions on these models and AI in general, join us at:
Thanks to the chirper.ai team!
I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training.
If you're able and willing to contribute, it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects.
Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.
Special thanks to: 卡本托德·卡尔卡拉, 丽欧布朗, 真田省三, 球状翻译服务, Dmitriy Samsonov.
Patreon special mentions: Pyrater, WelcomeToTheClub, Kalila, Mano Prime, Trenton Dambrowitz, Spiking Neurons AB, Pierre Kircher, Fen Risland, Kevin Schuppel, Luke, Rainer Wilmers, vamX, Gabriel Puliatti, Alex, Karl Bernard, Ajan Kanaga, Talal Aujan, Space Cruiser, ya boyyy, biorpg, Johann-Peter Hartmann, Asp the Wyvern, Ai Maven, Ghost, Preetika Verma, Nikolai Manek, trip7s trip, John Detwiler, Fred von Graf, Artur Olbinski, subjectnull, John Villwock, Junyu Yang, Rod A, Lone Striker, Chris McCloskey, Iucharbius, Matthew Berman, Illia Dulskyi, Khalefa Al-Ahmad, Imad Khwaja, chris gileta, Willem Michiel, Greatston Gnanesh, Derek Yates, K, Alps Aficionado, Oscar Rangel, David Flickinger, Luke Pendergrass, Deep Realms, Eugene Pentland, Cory Kujawski, terasurfer, Jonathan Leane, senxiiz, Joseph William Delisle, Sean Connelly, webtim, zynix, Nathan LeClaire.
Thank you to all my generous patrons and donaters!
This is an OpenLLaMa-7B model trained on explain-tuned datasets, created using instructions and input from the WizardLM, Alpaca and Dolly-V2 datasets and applying the dataset construction approaches from the Orca Research Paper.
We built explain-tuned versions of the WizardLM dataset (~70K), Alpaca dataset (~52K) and Dolly-V2 dataset (~15K), using the 15 system instructions provided in the Orca Research Paper to generate custom datasets.
This helps the student model (i.e. this model) learn the thought process of the teacher model (ChatGPT, gpt-3.5-turbo-0301 version), unlike the vanilla instruction-tuning approaches used by the original datasets.
See the example usage below, which shows how a System prompt is added before each instruction.
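As a rough illustration only (the field names and content below are hypothetical, not the actual dataset schema), an explain-tuned record pairs one of the Orca-style system instructions with an original instruction and the teacher's detailed response:

```python
# Hypothetical explain-tuned training record (illustrative only).
# The system instruction is one of the 15 Orca-style prompts used to
# elicit detailed, step-by-step output from the teacher model.
record = {
    "system": ("You are an AI assistant. Provide a detailed answer so the "
               "user doesn't need to search outside to understand the answer."),
    "instruction": "Why does ice float on water?",
    "input": "",
    "response": "...detailed explanation generated by gpt-3.5-turbo-0301...",
}
```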
The training configuration is shown in the table below.
Training took around 7 hours on 8x A100 (80G) GPUs and cost $84, using Lambda Labs.
We used DeepSpeed with fully sharded data parallelism, also known as ZeRO stage 3, writing our own fine-tuning script and leveraging some of the model training code provided by the amazing OpenAlpaca repo.
Here are some of the parameters used during training (a sketch of a matching DeepSpeed config follows the table):
| Parameter | Value |
| --- | --- |
| batch_size | 32 |
| train_micro_batch_size_per_gpu | 2 |
| gradient_accumulation_steps | 2 |
| Learning rate | 2e-5 |
| Max length | 1024 |
| Epochs | 3 |
| Optimizer | AdamW |
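As a hedged sketch, a DeepSpeed configuration consistent with ZeRO stage 3 and the table above might look like the following; the exact config used for the run is not published here, so any field beyond the table values (e.g. fp16) is an assumption.

```python
# Hypothetical DeepSpeed config consistent with the table above.
# Effective batch size: 2 micro-batch x 2 accumulation steps x 8 GPUs = 32.
ds_config = {
    "train_batch_size": 32,
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 2,
    "zero_optimization": {"stage": 3},  # fully sharded data parallel (ZeRO-3)
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
    "fp16": {"enabled": True},          # assumption, not stated in the card
}
```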
Here is an example of how to use this model:
```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Hugging Face model_path
model_path = 'psmathur/orca_mini_7b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)

# Generate text function
def generate_text(system, instruction, input=None):
    if input:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    else:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"

    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to('cuda')

    instance = {'input_ids': tokens, 'top_p': 1.0, 'temperature': 0.7,
                'generate_len': 1024, 'top_k': 50}

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance['generate_len'],
            use_cache=True,
            do_sample=True,
            top_p=instance['top_p'],
            temperature=instance['temperature'],
            top_k=instance['top_k'],
        )
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return f'[!] Response: {string}'

# Sample test instruction used by YouTuber Sam Witteveen https://www.youtube.com/@samwitteveenai
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = 'Write a letter to Sam Altman, CEO of OpenAI, requesting him to convert GPT4 a private model by OpenAI to an open source project'
print(generate_text(system, instruction))
```
[!] Response: Dear Sam Altman, I am writing to request that you convert the GPT4 private model developed by OpenAI to an open source project. As a user of OpenAI, I have been waiting for the day when I can use the advanced natural language processing capabilities of GPT4 in a more open and accessible way. While OpenAI has made significant progress in developing AI applications, it has primarily focused on building private models that are not accessible to the general public. However, with the recent release of GPT-3, there is a growing demand for more open and accessible AI tools. Converting GPT4 to an open source project would allow for greater transparency, collaboration, and innovation. It would also help to build trust in the technology and ensure that it is used ethically and responsibly. I urge you to consider converting GPT4 to an open source project. This would be a significant contribution to the AI community and would help to create a more open and accessible future. Thank you for your consideration. Sincerely, [Your Name]
Note: I am #opentowork and open to #collaboration; if you can help, please reach out to me at psmathur.public@gmail.com
Next goals:
Limitations and biases:
This model can produce factually incorrect output, and should not be relied on to produce factually accurate information. It was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
Disclaimer:
The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
Citation:
If you found wizardlm_alpaca_dolly_orca_open_llama_7b useful in your research or applications, please cite using the following BibTeX:
@misc{wizardlm_alpaca_dolly_orca_open_llama_7b,
  author = {Pankaj Mathur},
  title = {wizardlm_alpaca_dolly_orca_open_llama_7b: An explain tuned OpenLLaMA-7b model on custom wizardlm, alpaca, & dolly datasets},
  year = {2023},
  publisher = {GitHub, HuggingFace},
  journal = {GitHub repository, HuggingFace repository},
  howpublished = {\url{https://github.com/pankajarm/wizardlm_alpaca_dolly_orca_open_llama_7b}, \url{https://huggingface.co/psmathur/wizardlm_alpaca_dolly_orca_open_llama_7b}},
}
@software{openlm2023openllama,
  author = {Xinyang Geng and Hao Liu},
  title = {OpenLLaMA: An Open Reproduction of LLaMA},
  month = {May},
  year = {2023},
  url = {https://github.com/openlm-research/open_llama}
}
@misc{openalpaca,
  author = {Yixuan Su and Tian Lan and Deng Cai},
  title = {OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/yxuansu/OpenAlpaca}},
}
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}