Model:
Salesforce/xgen-7b-4k-base
Official research release of the XGen family (7B) of models from Salesforce AI Research:
Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length
Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong.
(* indicates equal contribution)
Correspondence to: Shafiq Rayhan Joty, Caiming Xiong
Models that are supervised fine-tuned on public-domain instruction data are released for research purposes only.
The training data for the models was tokenized with the OpenAI Tiktoken library. To use this model, install the package via pip:
pip install tiktoken
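A small pre-flight check can make a missing dependency easier to diagnose before the tokenizer is loaded. The sketch below is illustrative and not part of the XGen release: `is_installed` and `require_tiktoken` are hypothetical helper names, and the check relies only on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found by the import system."""
    return importlib.util.find_spec(package) is not None


def require_tiktoken() -> None:
    """Raise a readable error if tiktoken is missing (hypothetical helper)."""
    if not is_installed("tiktoken"):
        raise ImportError(
            "tiktoken is not installed; run `pip install tiktoken` first."
        )
```

Calling `require_tiktoken()` at the top of a script would surface the installation hint early instead of failing partway through tokenizer loading.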
The models can be used as auto-regressive samplers as follows:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# The XGen tokenizer is loaded as custom code from the Hub, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-4k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-4k-base", torch_dtype=torch.bfloat16)

# Tokenize the prompt and return PyTorch tensors.
inputs = tokenizer("The world is", return_tensors="pt")
# max_length counts the prompt tokens plus the generated tokens.
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
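To illustrate what "auto-regressive sampler" means, here is a minimal, dependency-free sketch of the decoding loop. It is not the XGen or `transformers` implementation: `greedy_generate` and `toy_scores` are hypothetical names, the five-token vocabulary is invented, and greedy decoding is chosen only to keep the output deterministic (a real `generate` call may use a different strategy depending on its generation configuration). As with `max_length` in the snippet above, the limit here covers the prompt plus the generated tokens.

```python
from typing import Callable, List, Sequence


def greedy_generate(
    next_token_scores: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_length: int,
    eos_id: int,
) -> List[int]:
    """Extend prompt_ids one token at a time until max_length or eos_id."""
    ids = list(prompt_ids)
    while len(ids) < max_length:
        # Each step conditions on everything generated so far.
        scores = next_token_scores(ids)
        # Greedy decoding: pick the highest-scoring token id.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


def toy_scores(ids: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a language model over a 5-token vocabulary.

    It always prefers the token id that follows the last one, wrapping around.
    """
    vocab_size = 5
    preferred = (ids[-1] + 1) % vocab_size
    return [1.0 if t == preferred else 0.0 for t in range(vocab_size)]


if __name__ == "__main__":
    # Prompt [1, 2] is extended with 3, 4, then 0 (the end-of-sequence id).
    print(greedy_generate(toy_scores, prompt_ids=[1, 2], max_length=6, eos_id=0))
```

Because the limit includes the prompt, a prompt that already has `max_length` tokens is returned unchanged.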
Citation:
@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen}
}