Model:

Salesforce/xgen-7b-8k-inst

XGen-7B-8K-Inst

Official research release of the XGen family of models (7B) by Salesforce AI Research:

Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length

Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong

(* indicates equal contribution)

Correspondence to: Shafiq Rayhan Joty, Caiming Xiong

Models

Base models

  • XGen-7B-4K-Base: XGen-7B model pre-trained with a 4K sequence length.
    • License: Apache-2.0
  • XGen-7B-8K-Base: XGen-7B model pre-trained with an 8K sequence length.
    • License: Apache-2.0
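
The 4K and 8K labels above are the sequence lengths used during pre-training, so it can help to check that a prompt plus the tokens to be generated stays inside that window. The following is a minimal sketch, not part of the XGen release. It assumes that "4K" and "8K" mean 4096 and 8192 tokens and that prompt and generated tokens share the same window. The names `CONTEXT_LENGTHS`, `fits_context`, and `toy_encode` are hypothetical, and `toy_encode` is only a stand-in for a real tokenizer.

```python
from typing import Callable, List, Sequence

# Assumption: "4K" and "8K" mean 4096 and 8192 tokens respectively.
CONTEXT_LENGTHS = {"4k": 4096, "8k": 8192}


def fits_context(
    text: str,
    encode: Callable[[str], Sequence[int]],
    variant: str = "8k",
    max_new_tokens: int = 0,
) -> bool:
    """Return True if the prompt plus the generation budget fits the window."""
    limit = CONTEXT_LENGTHS[variant]
    prompt_tokens = len(encode(text))
    return prompt_tokens + max_new_tokens <= limit


def toy_encode(text: str) -> List[int]:
    """Illustrative stand-in tokenizer: one token per whitespace-separated word."""
    return list(range(len(text.split())))


long_text = "word " * 5000  # 5000 toy tokens
print(fits_context(long_text, toy_encode, variant="8k"))  # 5000 <= 8192
print(fits_context(long_text, toy_encode, variant="4k"))  # 5000 > 4096
```

With a Hugging Face tokenizer, `encode` could be something like `lambda t: tokenizer(t)["input_ids"]`. Real token counts will differ from the whitespace-based stand-in used here.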

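Instruction-finetuned models

A supervised fine-tuned model trained on public-domain instruction data. Released for research purposes only.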

How to run

The training data for the models was tokenized with the OpenAI Tiktoken library.

To use this model, install the package via pip:

pip install tiktoken

The models can be used as auto-regressive samplers as follows:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-inst", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-inst", torch_dtype=torch.bfloat16)

header = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n"
)
article = ""  # insert a document here
prompt = f"### Human: Please summarize the following article.\n\n{article}.\n###"

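# Tokenize the full prompt into PyTorch tensors
inputs = tokenizer(header + prompt, return_tensors="pt")
# Sample up to 2048 new tokens with top-k sampling (k=100), stopping at token id 50256
sample = model.generate(**inputs, do_sample=True, max_new_tokens=2048, top_k=100, eos_token_id=50256)
# Decode the full sequence (prompt plus generated tokens) and drop the "Assistant:" label
output = tokenizer.decode(sample[0])
print(output.strip().replace("Assistant:", ""))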
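
The example above builds the prompt inline and prints the decoded sequence, which includes the prompt itself. The sketch below wraps those two steps in small helpers so the prompt format can be reused and the reply separated from the echoed prompt. It is an illustration only: `build_summary_prompt` and `extract_reply` are hypothetical names, and `extract_reply` assumes the decoded text starts with the prompt verbatim, which may not hold if the tokenizer round-trip changes whitespace. The demo uses a hand-written string in place of real model output.

```python
HEADER = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n"
)


def build_summary_prompt(article: str) -> str:
    """Assemble the header and the human turn in the same format as the example."""
    return HEADER + f"### Human: Please summarize the following article.\n\n{article}.\n###"


def extract_reply(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt (if present) and the 'Assistant:' label."""
    # Assumption: the decoded text begins with the prompt, character for character.
    text = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    return text.replace("Assistant:", "").strip()


# Demo with a made-up decoded string standing in for real model output.
prompt = build_summary_prompt("XGen is a 7B model")
decoded = prompt + " Assistant: XGen is a 7B language model."
print(extract_reply(decoded, prompt))
```

In the original example, `build_summary_prompt(article)` would replace `header + prompt`, and `extract_reply(output, ...)` would replace the final `replace` call.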

Citation

@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp and Tian Xie and Hiroaki Hayashi and Bo Pang and Congying Xia and Chen Xing and Jesse Vig and Semih Yavuz and Philippe Laban and Ben Krause and Senthil Purushwalkam and Tong Niu and Wojciech Kryscinski and Lidiya Murakhovs'ka and Prafulla Kumar Choubey and Alex Fabbri and Ye Liu and Rui Meng and Lifu Tu and Meghana Bhat and Chien-Sheng Wu and Silvio Savarese and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen}
}