google/t5-small-ssm-nq

Model:

google/t5-small-ssm-nq

Task:

Text-to-text generation

Libraries:

PyTorch TensorFlow JAX Transformers

Datasets:

c4 wikipedia natural_questions

Language:

Other:

t5 AutoTrain Compatible text-generation-inference

Preprints:

arxiv:2002.08909 arxiv:1910.10683

License:

apache-2.0

Google's T5 for Closed Book Question Answering.

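The model was pre-trained on C4 using T5's denoising objective, then additionally pre-trained on Wikipedia using REALM's salient span masking objective, and finally fine-tuned on Natural Questions (NQ).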

Note: The model was fine-tuned on 100% of the Natural Questions (NQ) training data for 10k steps.
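
The salient span masking step mentioned above can be illustrated with a minimal sketch. It assumes the salient spans (named entities and dates) have already been located by some tagger, which is not shown, and it reuses T5's `<extra_id_N>` sentinel tokens to build an input/target pair. The helper names `find_span` and `mask_salient_spans` are hypothetical, and how many spans are masked per example is an assumption; this is not the authors' preprocessing code.

```python
def find_span(text, phrase):
    """Return the (start, end) character offsets of the first match of `phrase`."""
    start = text.index(phrase)
    return start, start + len(phrase)


def mask_salient_spans(text, spans):
    """Replace each span with a T5 sentinel token and build the matching target.

    `spans` is assumed to be a sorted, non-overlapping list of (start, end)
    character offsets produced by an entity/date tagger (hypothetical here).
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        # Keep the text before the span, then drop in the sentinel.
        inputs.append(text[cursor:start] + sentinel)
        # The target lists each sentinel followed by the text it hides.
        targets.extend([sentinel, text[start:end]])
        cursor = end
    inputs.append(text[cursor:])
    # A final sentinel closes the target sequence, as in T5 span corruption.
    targets.append(f"<extra_id_{len(spans)}>")
    return "".join(inputs), " ".join(targets)


text = "Franklin D. Roosevelt was born in January 1882."
spans = [find_span(text, "Franklin D. Roosevelt"), find_span(text, "January 1882")]
masked_input, target = mask_salient_spans(text, spans)
print(masked_input)
print(target)
```

Because only entities and dates are hidden, the model has to recall factual content rather than arbitrary tokens, which is the property that makes this objective relevant to closed book question answering.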

Other community checkpoints: here

Paper: How Much Knowledge Can You Pack Into the Parameters of a Language Model?

Authors: Adam Roberts, Colin Raffel, Noam Shazeer

Results on Natural Questions - Test Set

Id	link	Exact Match (%)
T5-small	1239321	25.5
T5-large	12310321	30.4
T5-xl	12311321	35.6
T5-xxl	12312321	37.9
T5-3b	12313321	33.2
T5-11b	12314321	36.6
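
The card does not spell out how Exact Match is computed. A common convention, assumed here, is SQuAD-style normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace) followed by a string comparison against every reference answer. The function names below are illustrative, not taken from the authors' evaluation code.

```python
import re
import string

_PUNCTUATION = set(string.punctuation)


def normalize_answer(s):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in _PUNCTUATION)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match(prediction, references):
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def exact_match_score(predictions, references_list):
    """Percentage of predictions that exactly match one of their references."""
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references_list))
    return 100.0 * hits / len(predictions)


predictions = ["The January 30, 1882", "1990"]
references_list = [["January 30 1882"], ["1991", "1992"]]
print(exact_match_score(predictions, references_list))
```

Under this assumption, a score such as 25.5 means that about one question in four received an answer identical to a reference after normalization.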

Usage

The model can be used as follows for closed book question answering:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

t5_qa_model = AutoModelForSeq2SeqLM.from_pretrained("google/t5-small-ssm-nq")
t5_tok = AutoTokenizer.from_pretrained("google/t5-small-ssm-nq")

input_ids = t5_tok("When was Franklin D. Roosevelt born?", return_tensors="pt").input_ids
gen_output = t5_qa_model.generate(input_ids)[0]

print(t5_tok.decode(gen_output, skip_special_tokens=True))

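Abstract

It has recently been observed that neural language models trained on unstructured text can implicitly store and retrieve knowledge using natural language queries. In this short paper, we measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge. We show that this approach scales with model size and performs competitively with open-domain systems that explicitly retrieve answers from an external knowledge source when answering questions. To facilitate reproducibility and future work, we release our code and trained models at https://goo.gle/t5-cbqa.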

Author:

Google AI

Dataset size:

881.97 MB