Model:

Salesforce/xgen-7b-4k-base

XGen-7B-4K-Base

Official research release for the family of XGen models (7B) by Salesforce AI Research:

Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length

Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong

(* indicates equal contribution)

Correspondence to: Shafiq Rayhan Joty, Caiming Xiong

Models

Base models

  • XGen-7B-4K-Base: XGen-7B model pre-trained under a 4K sequence length.
    • License: Apache-2.0
  • XGen-7B-8K-Base: XGen-7B model pre-trained under an 8K sequence length.
    • License: Apache-2.0

Instruction-finetuned models

Models supervised fine-tuned on public-domain instructional data, released for research purposes only.

How to run

The training data for the models is tokenized with the OpenAI Tiktoken library. To use this model, install the package via pip:

pip install tiktoken

The models can be used as auto-regressive samplers as follows:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# XGen uses a custom tiktoken-based tokenizer, so trust_remote_code is required
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-4k-base", trust_remote_code=True)
# Load the weights in bfloat16 to roughly halve memory usage
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-4k-base", torch_dtype=torch.bfloat16)
inputs = tokenizer("The world is", return_tensors="pt")
# Greedy decoding up to 128 tokens (prompt tokens included)
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
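Since this variant was pre-trained under a 4K sequence length, long prompts should be clipped so that the prompt plus generated tokens fit the window. A minimal sketch, assuming a 4096-token context for XGen-7B-4K-Base; `clip_to_context` is a hypothetical helper, not part of the release:

```python
# Assumed context window for XGen-7B-4K-Base (4K variant)
MAX_CONTEXT = 4096

def clip_to_context(input_ids, max_context=MAX_CONTEXT, reserve_for_output=128):
    """Keep the most recent tokens, leaving room for generated output."""
    budget = max_context - reserve_for_output
    return input_ids[-budget:] if len(input_ids) > budget else input_ids

# A 5000-token prompt is clipped to the 4096 - 128 = 3968 most recent tokens.
long_prompt_ids = list(range(5000))
print(len(clip_to_context(long_prompt_ids)))
```

The clipped id list can then be passed to `model.generate` in place of the raw `input_ids`; dropping the oldest tokens keeps the most recent context, which is usually what matters for continuation.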

Citation

@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp and Tian Xie and Hiroaki Hayashi and Bo Pang and Congying Xia and Chen Xing and Jesse Vig and Semih Yavuz and Philippe Laban and Ben Krause and Senthil Purushwalkam and Tong Niu and Wojciech Kryscinski and Lidiya Murakhovs'ka and Prafulla Kumar Choubey and Alex Fabbri and Ye Liu and Rui Meng and Lifu Tu and Meghana Bhat and Chien-Sheng Wu and Silvio Savarese and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen}
}