XGen-7B-8K-Base

Official research release for the family of XGen (7B) models by Salesforce AI Research:

Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length

Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong

(* indicates equal contribution)

Contact: Shafiq Rayhan Joty, Caiming Xiong

Models

Base models

  • XGen-7B-4K-Base: the XGen-7B model pretrained with a 4K sequence length.
    • License: Apache-2.0
  • XGen-7B-8K-Base: the XGen-7B model pretrained with an 8K sequence length.
    • License: Apache-2.0

Instruction-finetuned models

Models supervised-finetuned on public-domain instructional data. Released for research purposes only.

How to run

The training data for the models is tokenized with the OpenAI Tiktoken library. To use this model, install the package via pip:

pip install tiktoken

The model can be used as an autoregressive sampler as follows:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# XGen ships a custom tiktoken-based tokenizer, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)
inputs = tokenizer("The world is", return_tensors="pt")
# Sample up to 128 tokens in total (prompt included) and decode the result.
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
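Because the base model was pretrained with an 8K context, the prompt tokens plus the tokens to be generated should fit within the context window. A minimal, framework-agnostic sketch of left-truncating an over-long prompt (the 8,192-token limit and the keep-the-most-recent-tokens policy here are illustrative assumptions, not part of the official API):

```python
MAX_CONTEXT = 8192  # assumed context window for XGen-7B-8K, in tokens

def clip_to_context(token_ids, max_new_tokens, max_context=MAX_CONTEXT):
    """Keep the most recent tokens so prompt + generation fits the window."""
    budget = max_context - max_new_tokens
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

# Example with dummy token ids standing in for tokenizer output:
prompt = list(range(9000))  # a 9,000-token prompt, too long as-is
clipped = clip_to_context(prompt, max_new_tokens=256)
print(len(clipped))  # 7936 tokens remain for the prompt
```

Short prompts pass through unchanged; only prompts that would overflow the window are trimmed from the left, preserving the most recent context.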

Citation

@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp and Tian Xie and Hiroaki Hayashi and Bo Pang and Congying Xia and Chen Xing and Jesse Vig and Semih Yavuz and Philippe Laban and Ben Krause and Senthil Purushwalkam and Tong Niu and Wojciech Kryscinski and Lidiya Murakhovs'ka and Prafulla Kumar Choubey and Alex Fabbri and Ye Liu and Rui Meng and Lifu Tu and Meghana Bhat and Chien-Sheng Wu and Silvio Savarese and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen}
}