google/t5-large-ssm-nq

Model:

google/t5-large-ssm-nq

Task:

Text2Text Generation

Libraries:

PyTorch TensorFlow JAX Transformers

Datasets:

c4 wikipedia natural_questions

Language:

Other:

t5 AutoTrain Compatible text-generation-inference

Preprint:

arxiv:2002.08909 arxiv:1910.10683

License:

apache-2.0


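Google's T5 for Closed Book Question Answering.

The model was pre-trained on C4 using T5's denoising objective, then additionally pre-trained using REALM's salient span masking objective, and finally fine-tuned on Natural Questions (NQ).

Note: The model was fine-tuned on 100% of the train splits of Natural Questions (NQ) for 10k steps.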
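To make the salient span masking objective mentioned above more concrete, here is a minimal sketch of how one training pair could be built. It rests on stated assumptions: dates are found with a simple regular expression (REALM also masks named entities found by a tagger, which is not attempted here), and `mask_salient_spans`, `MONTHS`, and `DATE_PATTERN` are hypothetical names used only for illustration. The sentinel tokens follow T5's `<extra_id_N>` convention. This is not the authors' preprocessing code.

```python
import re

# Assumption: dates stand in for "salient spans". REALM also masks named
# entities detected by a tagger; this sketch only handles "Month D, YYYY".
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_PATTERN = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")


def mask_salient_spans(text):
    """Replace each date with a T5 sentinel and build the matching target."""
    masked_parts = []
    target_parts = []
    cursor = 0
    count = 0
    for match in DATE_PATTERN.finditer(text):
        sentinel = f"<extra_id_{count}>"
        # Keep the text before the span, then put the sentinel in its place.
        masked_parts.append(text[cursor:match.start()])
        masked_parts.append(sentinel)
        # The target pairs each sentinel with the span it replaced.
        target_parts.append(f"{sentinel} {match.group(0)}")
        cursor = match.end()
        count += 1
    masked_parts.append(text[cursor:])
    # T5-style targets end with one final sentinel.
    target_parts.append(f"<extra_id_{count}>")
    return "".join(masked_parts), " ".join(target_parts)


masked, target = mask_salient_spans(
    "Franklin D. Roosevelt was born on January 30, 1882."
)
print(masked)  # Franklin D. Roosevelt was born on <extra_id_0>.
print(target)  # <extra_id_0> January 30, 1882 <extra_id_1>
```

Because only fact-bearing spans are masked, the model has to recall the missing fact itself rather than fill in an arbitrary stretch of text.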

Other community checkpoints: here

Paper: How Much Knowledge Can You Pack Into the Parameters of a Language Model?

Authors: Adam Roberts, Colin Raffel, Noam Shazeer

Results on Natural Questions - Test Set

Model	Exact Match
T5-small	25.5
T5-large	30.4
T5-xl	35.6
T5-xxl	37.9
T5-3b	33.2
T5-11b	36.6
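The Exact Match column can be read as the percentage of questions whose predicted answer equals one of the reference answers after light normalization. The sketch below assumes the commonly used SQuAD-style normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the paper's own evaluation script may differ in detail, and the function names here are hypothetical.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, references):
    """Return 1.0 if the prediction matches any reference after normalization."""
    normalized = normalize_answer(prediction)
    return float(any(normalized == normalize_answer(ref) for ref in references))


def exact_match_score(predictions, references_list):
    """Average exact match over a dataset, reported as a percentage."""
    scores = [
        exact_match(pred, refs)
        for pred, refs in zip(predictions, references_list)
    ]
    return 100.0 * sum(scores) / len(scores)


# Toy data for illustration only, not drawn from Natural Questions.
predictions = ["The January 30, 1882.", "December 26, 1892"]
references_list = [["january 30 1882"], ["January 30, 1882"]]
print(exact_match_score(predictions, references_list))  # 50.0
```

Under this metric an answer that is close but not identical, such as a date that is off by a few years, scores zero.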

Usage

The model can be used as follows for closed book question answering:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

t5_qa_model = AutoModelForSeq2SeqLM.from_pretrained("google/t5-large-ssm-nq")
t5_tok = AutoTokenizer.from_pretrained("google/t5-large-ssm-nq")

input_ids = t5_tok("When was Franklin D. Roosevelt born?", return_tensors="pt").input_ids
gen_output = t5_qa_model.generate(input_ids)[0]

print(t5_tok.decode(gen_output, skip_special_tokens=True))

# Per the model card, this should give "December 26, 1892": close, but not correct (Roosevelt was born on January 30, 1882).

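Abstract

It has recently been observed that neural language models trained on unstructured text can implicitly store and retrieve knowledge using natural language queries. In this short paper, we measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge. We show that this approach scales with model size and performs competitively with open-domain systems that explicitly retrieve answers from an external knowledge source when answering questions. To facilitate reproducibility and future work, we release our code and trained models at https://goo.gle/t5-cbqa.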

Author:

Google AI

Dataset size:

8.25 GB