Model:
pszemraj/bigbird-pegasus-large-K-booksum
This is the longest-trained "latest" version of the model, currently at 70k training steps.
An extended example, including a batch summarization demo, is available here.
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers import pipeline

# Load the fine-tuned BigBird-Pegasus checkpoint and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
    low_cpu_mem_usage=True,
)

tokenizer = AutoTokenizer.from_pretrained(
    "pszemraj/bigbird-pegasus-large-K-booksum",
)

# Wrap model and tokenizer in a summarization pipeline
summarizer = pipeline(
    "summarization",
    model=model,
    tokenizer=tokenizer,
)
```
```python
wall_of_text = "your text to be summarized goes here."

# Generate a summary between 16 and 256 tokens, blocking repeated trigrams
result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    clean_up_tokenization_spaces=True,
)

print(result[0]["summary_text"])
```
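As a minimal sketch of the batch summarization workflow mentioned above: the `summarization` pipeline also accepts a list of texts and returns one summary per input. The example documents and the `batch_size` value below are illustrative assumptions, not taken from the linked demo.

```python
# Sketch of batch summarization with the pipeline created above.
# The documents and batch_size are placeholder assumptions for illustration.
documents = [
    "first long document to summarize goes here.",
    "second long document to summarize goes here.",
]

results = summarizer(
    documents,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    clean_up_tokenization_spaces=True,
    batch_size=2,  # process inputs in small batches; tune to available memory
)

for res in results:
    print(res["summary_text"])
```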