# togethercomputer/RedPajama-INCITE-7B-Base
RedPajama-INCITE-7B-Base was developed by Together and leaders from the open-source AI community, including Ontocord.ai, ETH DS3Lab, AAI CERC, Université de Montréal, MILA - Québec AI Institute, the Stanford Center for Research on Foundation Models (CRFM), the Stanford Hazy Research group, and LAION. Training was carried out under the INCITE 2023 project for scalable, general-purpose AI foundation models, which provided 3,072 V100 GPUs. The award was granted to MILA, LAION, and EleutherAI in fall 2022, with support from the Oak Ridge Leadership Computing Facility (OLCF) and the INCITE program.
Please note that this model requires transformers version >= 4.25.1.
## GPU Inference

This requires a GPU with 16GB memory.
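Before loading the model, you can check how much memory the visible CUDA device has (a minimal sketch; the `gpu_memory_gb` helper and the 16 GB threshold below are illustrative, and the check assumes PyTorch is installed):

```python
import torch

def gpu_memory_gb(device=0):
    """Total memory of a CUDA device in GiB, or 0.0 if no GPU is visible."""
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(device).total_memory / 1024**3

# fp16 weights take roughly 2 bytes per parameter, i.e. about 14 GB
# for a 7B-parameter model, before activations and KV cache.
if gpu_memory_gb() < 16:
    print("Warning: less than 16 GB of GPU memory available.")
```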
```python
import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version
assert transformers.__version__ >= MIN_TRANSFORMERS_VERSION, f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Base")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Base", torch_dtype=torch.float16)
model = model.to('cuda:0')

# infer
prompt = "Alan Turing is"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
widely considered to be the father of modern computer science and artificial intelligence. He was a brilliant mathematician and cryptographer, who worked for the British government during World War II. He was instrumental in breaking the German Enigma code, and is credited with helping to shorten the war by two years...
"""
```
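The `generate` call above combines temperature, top-k, and top-p (nucleus) sampling. As a rough illustration of what the `top_p=0.7` filter does, here is a simplified sketch (not the actual `transformers` implementation; the toy distribution is made up):

```python
def top_p_filter(probs, p=0.7):
    """Keep the smallest set of highest-probability tokens whose cumulative
    probability reaches p, then renormalize (nucleus sampling)."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for idx, pr in ranked:
        kept.append((idx, pr))
        total += pr
        if total >= p:
            break
    norm = sum(pr for _, pr in kept)
    return {idx: pr / norm for idx, pr in kept}

# With p=0.7, only the two most likely tokens survive this toy distribution.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.7))
```

A lower `p` restricts sampling to fewer, higher-probability tokens, trading diversity for coherence.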
## GPU Inference in Int8

This requires a GPU with 12GB memory.
To run inference in Int8 mode, make sure you have installed accelerate and bitsandbytes. You can install them with the following commands:
```shell
pip install accelerate
pip install bitsandbytes
```
Then you can run inference in Int8 as follows:
```python
import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version
assert transformers.__version__ >= MIN_TRANSFORMERS_VERSION, f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Base")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Base", device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)

# infer
prompt = "Alan Turing is"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
a very well-known name in the world of computer science. It is named after the mathematician Alan Turing. He is famous for his work on the Enigma machine, which was used by the Germans during World War II....
"""
```

## CPU Inference

```python
import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

MIN_TRANSFORMERS_VERSION = '4.25.1'

# check transformers version
assert transformers.__version__ >= MIN_TRANSFORMERS_VERSION, f'Please upgrade transformers to version {MIN_TRANSFORMERS_VERSION} or higher.'

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Base")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/RedPajama-INCITE-7B-Base", torch_dtype=torch.bfloat16)

# infer
prompt = "Alan Turing is"
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
input_length = inputs.input_ids.shape[1]
outputs = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.7, top_k=50, return_dict_in_generate=True
)
token = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(token)
print(output_str)
"""
one of the most important figures in the history of computing. He is best known for his work on the development of the modern computer and for his code-breaking work during World War II. He was also a brilliant mathematician and philosopher.
"""
```
Please note that since LayerNormKernelImpl is not implemented in fp16 on CPU, we use bfloat16 for CPU inference.
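As a minimal illustration of this: LayerNorm runs on CPU in bfloat16, while the float16 equivalent may raise the `LayerNormKernelImpl` error on some PyTorch builds (whether it fails is version-dependent):

```python
import torch

# bfloat16 LayerNorm works on CPU, which is why the CPU example above
# loads the model with torch_dtype=torch.bfloat16.
layer = torch.nn.LayerNorm(8, dtype=torch.bfloat16)
x = torch.randn(2, 8, dtype=torch.bfloat16)
out = layer(x)
print(out.dtype)  # torch.bfloat16

# The float16 equivalent may fail on CPU with
# "'LayerNormKernelImpl' not implemented for 'Half'" on older builds:
try:
    torch.nn.LayerNorm(8, dtype=torch.float16)(x.to(torch.float16))
    print("fp16 LayerNorm supported on this CPU build")
except RuntimeError as err:
    print("fp16 LayerNorm failed:", err)
```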
The cases in which the model should not be used are described below.

It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.
## Out-of-Scope Use

RedPajama-INCITE-7B-Base is a language model and may not perform well on use cases outside of its intended scope. For example, it may not be suitable for safety-critical applications or for making decisions that have a significant impact on individuals or society. It is important to consider the limitations of the model and to use it only for its intended purpose.
## Misuse and Malicious Use

RedPajama-INCITE-7B-Base is designed for language modeling. Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the project.
To avoid misuse of the model, the following uses are prohibited:
## Limitations

RedPajama-INCITE-7B-Base, like other language models, has limitations that should be taken into consideration. For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside of its training data. We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive chatbot.
## Training Data
Please refer to togethercomputer/RedPajama-Data-1T.
## Training Procedure
Please refer to our blog post for benchmark results.
We provide 11 intermediate checkpoints that have been released for research use. The checkpoints are organized by the number of tokens they cover, ranging from 240 billion tokens to 1 trillion tokens.
Join us on the Together Discord.