Model:
pszemraj/bigbird-pegasus-large-K-booksum
This is the "latest" version of the model with the longest training time; it has currently been trained for 70k steps.
An extended example, including a batch summarization demo, is available here.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers import pipeline

# Load the fine-tuned BigBird-Pegasus checkpoint and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
)

# Wrap the model and tokenizer in a summarization pipeline
summarizer = pipeline(
    "summarization",
    model=model,
    tokenizer=tokenizer,
)

wall_of_text = "your text to be summarized goes here."

result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    clean_up_tokenization_spaces=True,
)
print(result[0]["summary_text"])
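
Since the linked demo covers batch summarization, here is a minimal sketch of how the same pipeline can handle several documents in one call; the docs list and batch_size value are illustrative assumptions, not part of the original example.

# Minimal batch-summarization sketch; `docs` and `batch_size` are placeholders
docs = [
    "first long document goes here.",
    "second long document goes here.",
]

results = summarizer(
    docs,          # the pipeline accepts a list of texts
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    clean_up_tokenization_spaces=True,
    batch_size=2,  # assumed batch size; tune for your hardware
)

for summary in results:
    print(summary["summary_text"])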