akoksal/LongForm-OPT-2.7B

Model:

akoksal/LongForm-OPT-2.7B

Task:

Text generation

Libraries:

PyTorch Safetensors Transformers

Datasets:

akoksal/LongForm

Language:

Other:

opt instruction-tuning text-generation-inference text-to-text

Preprint:

arxiv:2304.08460


LongForm-OPT-2.7B

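The LongForm dataset is created by pairing English corpus examples with augmented instructions. We select a diverse set of human-written documents from existing corpora such as C4 and Wikipedia and generate instructions for these documents with LLMs. We then extend these examples with structured corpus examples (such as Stack Exchange and WikiHow) and task examples (such as question answering, email writing, grammatical error correction, story/poem generation, and text summarization).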

GitHub Repo: https://github.com/akoksal/LongForm

For LongForm OPT and LLaMA models: use [EOI] to mark the end of the instruction.

LongForm-T5-XL: https://huggingface.co/akoksal/LongForm-T5-XL

LongForm-OPT-6.7B: https://huggingface.co/akoksal/LongForm-OPT-6.7B

How to Load

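import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("akoksal/LongForm-OPT-2.7B")
tokenizer = AutoTokenizer.from_pretrained("akoksal/LongForm-OPT-2.7B")

# For the OPT and LLaMA variants, the instruction must end with [EOI].
instruction = "Write an essay about meditation. [EOI]"
torch.manual_seed(42)  # fix the seed so that sampling is reproducible
input_ids = tokenizer(instruction, return_tensors="pt").input_ids
# Sample up to 50 new tokens with nucleus sampling (top_p=0.9).
target_ids = model.generate(input_ids, do_sample=True, max_new_tokens=50, top_p=0.9)
print(tokenizer.decode(target_ids[0], skip_special_tokens=True))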
# Output:
# > Write an essay about meditation. [EOI]Do you need some inspiration to\
# meditate? Do you know someone who is a great meditator but you aren't sure\
# what to say to them? This might be the perfect opportunity to tell them.\
# The ability to listen and learn and grow can
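
The decoded output above still contains the instruction and the [EOI] marker. The following is a minimal sketch of two helpers for working with that convention. The names `build_prompt` and `extract_response` are hypothetical and not part of the LongForm repository; they only assume that [EOI] separates the instruction from the generated text, as in the example output.

```python
EOI = "[EOI]"  # end-of-instruction marker used by the LongForm OPT and LLaMA models


def build_prompt(instruction: str) -> str:
    """Append the [EOI] marker to an instruction unless it is already there."""
    instruction = instruction.strip()
    if instruction.endswith(EOI):
        return instruction
    return f"{instruction} {EOI}"


def extract_response(generated: str) -> str:
    """Return only the text after the first [EOI] marker (hypothetical helper)."""
    _, marker, rest = generated.partition(EOI)
    # If the marker is missing, fall back to the full decoded string.
    return rest.strip() if marker else generated.strip()


prompt = build_prompt("Write an essay about meditation.")
print(prompt)
print(extract_response(prompt + "Do you need some inspiration to meditate?"))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of `instruction`, and `extract_response` can be applied to the result of `tokenizer.decode`.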

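Evaluation

We provide an in-depth evaluation of the LongForm models and baselines in the paper. Below we report the METEOR scores of the models on out-of-domain datasets. Across all tasks, namely recipe generation (RGen), long-form question answering (ELI5), and short story generation (WritingPrompts/WP), the LongForm models outperform prior instruction-tuned models.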

Model	All	Recipe Generation	ELI5	Writing Prompts
T0++	10.9	18.7	3.8	10.2
Tk-Instruct	6.3	12.9*	3.6	2.4
Flan-T5	10.6	20.9*	3.5	7.4
Alpaca-LLaMA-7B	14.6	19.5	12.5	11.8
OPT-30B	11.1	18.6	12.2	2.6
LongForm-T5-XL	16.3	20.2	18.3	10.6
LongForm-OPT-2.7B	17.8	15.5	17.9	19.9
LongForm-OPT-6.7B	17.7	16.9	17.2	19.0
LongForm-LLaMA-7B ‡	19.7	21.7	18.6	18.9

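Smaller versions of the LongForm-OPT models are also available.

‡: Due to the restrictions on LLaMA models, we can only publicly release the difference between LongForm-LLaMA-7B and the pretrained LLaMA-7B.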
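
To give a rough sense of what the METEOR scores in the table measure, the sketch below computes only the recall-weighted harmonic mean of exact unigram precision and recall that sits at the core of the original METEOR formulation. It is a simplification: it omits stemming, synonym matching, and the fragmentation penalty, and the function name `simplified_meteor_fmean` is hypothetical. The exact METEOR implementation and settings used in the paper are not stated here, so this will not reproduce the reported numbers.

```python
from collections import Counter


def simplified_meteor_fmean(hypothesis: str, reference: str) -> float:
    """Recall-weighted harmonic mean of exact unigram precision and recall."""
    hyp_tokens = hypothesis.lower().split()
    ref_tokens = reference.lower().split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram matches: each reference token can be matched at most once.
    matches = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if matches == 0:
        return 0.0
    precision = matches / len(hyp_tokens)
    recall = matches / len(ref_tokens)
    # F_mean = 10PR / (R + 9P), which weights recall nine times more than precision.
    return 10 * precision * recall / (recall + 9 * precision)


print(round(simplified_meteor_fmean("the cat", "the cat sat on the mat"), 3))  # 0.357
```

Because recall dominates the score, a short output that covers little of the reference is penalized heavily even when every word it contains is correct.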

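Limitations

The LongForm dataset and models focus mainly on long text generation and have limitations on structured prediction tasks in NLP. In addition, we observe that LongForm models may exhibit hallucination problems similar to those found in other LLMs.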

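License

The LongForm project is released under the MIT License, with additional custom restrictions imposed by OpenAI (for the instruction generation part) and by the licenses of the language models (OPT, LLaMA, and T5).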

Citation

@misc{koksal2023longform,
      title={LongForm: Optimizing Instruction Tuning for Long Text Generation with Corpus Extraction}, 
      author={Abdullatif Köksal and Timo Schick and Anna Korhonen and Hinrich Schütze},
      year={2023},
      eprint={2304.08460},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Author:

Abdullatif Koksal

Dataset size:

9.88 GB