Dataset:
castorini/nq_gar-t5_expansions
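This repo provides answer, title, and sentence expansions generated by the gar-T5 model for the Natural Questions corpus.
It has two splits: dev and test.
A sample data entry from the dev split: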
{ "id": "1", "predicted_answers": ["312"], "predicted_titles": ["Invisible Man"], "predicted_sentences": ["The Invisible Man First edition Author Ralph Ellison Cover artist M."] }
A sample data entry from the test split:
{ "id": "1", "predicted_answers": ["May 18 , 2018"], "predicted_titles": ["Deadpool 2 *** Deadpool (film) *** Deadpool 2 (soundtrack) *** X-Men in other media"], "predicted_sentences": ["Deadpool 2 was released on May 18 , 2018 , with Leitch directing from a screenplay by Rhett Reese and Paul Wernick ."] }
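Sample code for loading the dataset:
from datasets import load_dataset

# Map each split name to its JSONL file in the repo.
data_files = {"dev": "dev/dev.jsonl", "test": "test/test.jsonl"}
dataset = load_dataset("castorini/nq_gar-t5_expansions", data_files=data_files)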
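The sample entries above suggest that each field holds a list of strings, and that multiple predicted titles are packed into one string separated by `***`. The sketch below parses the test-split sample shown earlier and splits the titles apart. The separator convention is inferred from that single sample, not from any documented schema, and `split_titles` and `expand_query` are hypothetical helper names introduced only for illustration. Appending the expansions to the question text is one plausible way to use them, not necessarily how the dataset authors do it.

```python
import json

# Test-split sample record, copied from the example entry above.
record = json.loads(
    '{ "id": "1", "predicted_answers": ["May 18 , 2018"], '
    '"predicted_titles": ["Deadpool 2 *** Deadpool (film) *** '
    'Deadpool 2 (soundtrack) *** X-Men in other media"], '
    '"predicted_sentences": ["Deadpool 2 was released on May 18 , 2018 , '
    'with Leitch directing from a screenplay by Rhett Reese and Paul Wernick ."] }'
)

# Assumption: titles are joined with "***", as seen in the sample record.
TITLE_SEPARATOR = "***"


def split_titles(record):
    """Hypothetical helper: return the individual predicted titles."""
    titles = []
    for joined in record["predicted_titles"]:
        for title in joined.split(TITLE_SEPARATOR):
            title = title.strip()
            if title:
                titles.append(title)
    return titles


def expand_query(question, record):
    """Hypothetical helper: append all expansions to the question text."""
    parts = [question]
    parts += record["predicted_answers"]
    parts += split_titles(record)
    parts += record["predicted_sentences"]
    return " ".join(parts)


titles = split_titles(record)
print(titles)
```

The question string passed to `expand_query` has to come from the Natural Questions data itself, since the expansion records shown here carry only an `id`.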