Model:
Salesforce/xgen-7b-8k-inst
Official research release of the XGen-7B models (7B series) from Salesforce AI Research:
Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length
Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong.
(* indicates equal contribution)
Correspondence: Shafiq Rayhan Joty, Caiming Xiong
A supervised fine-tuned model trained on public-domain instruction data. Released for research purposes only.
The model's training data was tokenized with OpenAI's Tiktoken library.
To use this model, install the package via pip:

```shell
pip install tiktoken
```
The model can be used as an autoregressive sampler as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-inst", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-inst", torch_dtype=torch.bfloat16)

header = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n"
)
article = ""  # insert a document here
prompt = f"### Human: Please summarize the following article.\n\n{article}.\n###"

inputs = tokenizer(header + prompt, return_tensors="pt")
sample = model.generate(**inputs, do_sample=True, max_new_tokens=2048, top_k=100, eos_token_id=50256)
output = tokenizer.decode(sample[0])
print(output.strip().replace("Assistant:", ""))
```
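The `### Human:` prompt format used above extends naturally to multi-turn conversations. The helper below is a minimal sketch of how such a prompt could be assembled; the `build_prompt` name and the turn structure are illustrative assumptions inferred from the single-turn example, not an official API of this model:

```python
# Hypothetical helper (not part of the release): assemble a multi-turn prompt
# in the same "### Human: ... ### Assistant: ..." style as the example above.
HEADER = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n"
)

def build_prompt(turns):
    """turns: list of (role, text) pairs, with role "Human" or "Assistant"."""
    parts = [HEADER]
    for role, text in turns:
        parts.append(f"### {role}: {text}\n")
    parts.append("###")  # trailing marker cues the model to produce the next turn
    return "".join(parts)

prompt = build_prompt([
    ("Human", "What is XGen?"),
])
print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer in place of `header + prompt` in the sampling example above.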
Citation:

```bibtex
@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp and Tian Xie and Hiroaki Hayashi and Bo Pang and Congying Xia and Chen Xing and Jesse Vig and Semih Yavuz and Philippe Laban and Ben Krause and Senthil Purushwalkam and Tong Niu and Wojciech Kryscinski and Lidiya Murakhovs'ka and Prafulla Kumar Choubey and Alex Fabbri and Ye Liu and Rui Meng and Lifu Tu and Meghana Bhat and Chien-Sheng Wu and Silvio Savarese and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen}
}
```