Model:
togethercomputer/RedPajama-INCITE-Chat-3B-v1
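RedPajama-INCITE-Chat-3B-v1 was developed by Together and leaders from the open-source AI community, including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, Stanford Center for Research on Foundation Models (CRFM), Stanford Hazy Research research group and LAION.
It is fine-tuned on OASST1 and Dolly2 to enhance its chat ability.
Please note that the model requires transformers version >= 4.25.1.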
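A version requirement like this is easy to check incorrectly, because comparing version strings directly is lexicographic. The sketch below is a minimal, standard-library-only illustration of a numeric comparison. The helper names `release_tuple` and `meets_minimum` are hypothetical and not part of transformers, and the parser deliberately ignores pre-release suffixes such as `dev0` or `rc1`.

```python
import re

MIN_TRANSFORMERS_VERSION = "4.25.1"


def release_tuple(version_string):
    """Return the leading numeric components, e.g. '4.30.0.dev0' -> (4, 30, 0)."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_TRANSFORMERS_VERSION):
    """Compare versions numerically, component by component."""
    return release_tuple(installed) >= release_tuple(minimum)


# Numeric comparison: 4.9.0 is older than 4.25.1.
print(meets_minimum("4.9.0"))   # False
# Raw string comparison gets this wrong because '9' sorts after '2'.
print("4.9.0" >= "4.25.1")      # True
```

In a real environment the installed version would come from `transformers.__version__`. A full version parser such as the one in the `packaging` library also handles pre-release ordering, which this sketch does not.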
To prompt the chat model, use the following format:
<human>: [Instruction]
<bot>:
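As a small illustration of this format, the sketch below wraps a single instruction in the two tags, with a newline before `<bot>:` to match the prompt string used in the inference code in this card. The function name `build_prompt` is an assumption made for this example, not an API of the model or of transformers.

```python
HUMAN_TAG = "<human>:"
BOT_TAG = "<bot>:"


def build_prompt(instruction):
    """Wrap one user instruction in the <human>/<bot> chat format."""
    # The prompt ends with the bot tag so the model continues with its reply.
    return f"{HUMAN_TAG} {instruction.strip()}\n{BOT_TAG}"


prompt = build_prompt("Who is Alan Turing?")
print(repr(prompt))
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.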
GPU inference in fp16 requires a GPU with at least 8GB of memory.
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (compare parsed versions, not raw strings)
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init: load the weights in fp16 and move the model to the first GPU
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1", torch_dtype=torch.float16)
model = model.to('cuda:0')

# infer
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Turing was a British mathematician, logician, cryptologist, and computer scientist. He is widely regarded as the father of computer science and artificial intelligence.
"""
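Because generation here runs for up to `max_new_tokens` tokens, the decoded text may in some cases continue past the bot's answer into a further `<human>:` turn. That behaviour is an assumption about chat-tuned models in general and is not stated in this card. Under that assumption, a minimal post-processing sketch (the helper `extract_reply` is hypothetical) could cut the text at the next human tag:

```python
def extract_reply(generated_text, stop_sequence="<human>:"):
    """Keep only the text before the next human turn, if one was generated."""
    # str.partition returns the whole string as the first element
    # when the stop sequence is absent, so both cases are handled.
    reply, _, _ = generated_text.partition(stop_sequence)
    return reply.strip()


raw = " Alan Turing was a British mathematician.\n<human>: Thanks!\n<bot>: You're welcome."
print(extract_reply(raw))
```

This would be applied to the decoded output string before printing or returning it.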
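Int8 inference requires a GPU with at least 6GB of memory.
To run inference with int8, please ensure you have installed accelerate and bitsandbytes. You can install them with the following commands:
pip install accelerate
pip install bitsandbytes
Then you can run inference with int8 as follows: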
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (compare parsed versions, not raw strings)
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init: load the weights in 8-bit and let accelerate place them on the available device(s)
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1", device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)

# infer
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Turing was a British mathematician and computer scientist who made important contributions to computer science and mathematical logic. He is widely regarded as the father of computer science and artificial intelligence for his work on the Turing machine and Turing test.
"""
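The GPU memory figures quoted in this card (8GB for fp16, 6GB for Int8) can be sanity-checked with a back-of-the-envelope estimate of the weight storage alone. The sketch below assumes roughly 3 billion parameters, taken from the "3B" in the model name rather than from an exact count, and it ignores activations, the generation cache and framework overhead, so real usage will be higher than these numbers.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}

# Assumption: parameter count inferred from the "3B" in the model name.
ASSUMED_NUM_PARAMS = 3_000_000_000


def weight_memory_gib(num_params, dtype):
    """Memory taken by the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("float16", "int8"):
    gib = weight_memory_gib(ASSUMED_NUM_PARAMS, dtype)
    print(f"{dtype}: ~{gib:.1f} GiB of weights")
```

Under these assumptions the fp16 weights come to about 5.6 GiB and the Int8 weights to about 2.8 GiB, which leaves headroom within the quoted GPU sizes for everything the estimate leaves out.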
import torch
import transformers
from packaging import version
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version (compare parsed versions, not raw strings)
assert version.parse(transformers.__version__) >= version.parse(MIN_TRANSFORMERS_VERSION), f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init: load the weights in bfloat16; the model stays on the CPU
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-Chat-3B-v1", torch_dtype=torch.bfloat16)

# infer
prompt = "<human>: Who is Alan Turing?\n<bot>:"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
# keep only the newly generated tokens, dropping the prompt
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
Alan Turing was a British mathematician and computer scientist who made important contributions to the fields of mathematics, cryptography, and computer science. He is widely regarded as the father of computer science and artificial intelligence.
"""
Please note that since LayerNormKernelImpl is not implemented in fp16 for CPU, we use bfloat16 for CPU inference.
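The three inference setups in this card differ mainly in the keyword arguments passed to `from_pretrained`. The sketch below summarizes that choice in one place. The function `loading_options` is a hypothetical helper, and it returns dtype names as strings so that it runs without torch installed. Rejecting Int8 on the CPU is a choice made for this sketch, since the card only describes Int8 on a GPU.

```python
def loading_options(use_gpu, int8=False):
    """Return from_pretrained keyword arguments mirroring this card's three setups."""
    if not use_gpu:
        if int8:
            # Sketch-level choice: the card only describes Int8 inference on a GPU.
            raise ValueError("Int8 inference is only described for GPU in this card")
        # CPU: bfloat16, because fp16 LayerNorm is not implemented on CPU.
        return {"torch_dtype": "bfloat16"}
    if int8:
        # GPU Int8: needs accelerate and bitsandbytes installed.
        return {"torch_dtype": "float16", "device_map": "auto", "load_in_8bit": True}
    # GPU fp16: the card then moves the model with model.to('cuda:0').
    return {"torch_dtype": "float16"}


print(loading_options(use_gpu=False))
print(loading_options(use_gpu=True, int8=True))
```

Before calling `from_pretrained`, the dtype name would need to be converted to the real dtype object, for example with `getattr(torch, options["torch_dtype"])`.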
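Excluded uses are described below.
It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.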
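Out-of-Scope Use
RedPajama-INCITE-Chat-3B-v1 is a language model and may not perform well for use cases outside of its intended scope. For example, it may not be suitable for use in safety-critical applications or for making decisions that have a significant impact on individuals or society. It is important to consider the limitations of the model and to use it only for its intended purpose.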
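Misuse and Malicious Use
RedPajama-INCITE-Chat-3B-v1 is designed for language modeling. Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the project.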
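Misuse of this model includes, but is not limited to, the following:
RedPajama-INCITE-Chat-3B-v1, like other language models, has limitations that should be taken into consideration. For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside the scope of its training data. We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive chatbot.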
Training Data
Please refer to togethercomputer/RedPajama-Data-1T
Training Procedure
Join us on Together Discord