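Model:
bigscience/bloomz-p3
We present BLOOMZ & mT0, a family of models capable of following human instructions in dozens of languages zero-shot. We finetune the BLOOM & mT5 pretrained multilingual language models on our crosslingual task mixture (xP3) and find that the resulting models are capable of crosslingual generalization to unseen tasks and languages.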
Multitask finetuned on 1239321 . Recommended for prompting in English. | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Parameters | 300M | 580M | 1.2B | 3.7B | 13B | 560M | 1.1B | 1.7B | 3B | 7.1B | 176B |
Finetuned Model | 12310321 | 12311321 | 12312321 | 12313321 | 12314321 | 12315321 | 12316321 | 12317321 | 12318321 | 12319321 | 12320321 |
Multitask finetuned on 12321321 . Recommended for prompting in non-English. | |||||||||||
Finetuned Model | 12322321 | 12323321 | 12324321 | ||||||||
Multitask finetuned on 12325321 . Released for research purposes only. Strictly inferior to above models! | |||||||||||
Finetuned Model | 12326321 | 12327321 | 12328321 | ||||||||
Original pretrained checkpoints. Not recommended. | |||||||||||
Pretrained Model | 12329321 | 12330321 | 12331321 | 12332321 | 12333321 | 12334321 | 12335321 | 12336321 | 12337321 | 12338321 | 12339321 |
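We recommend using the model to perform tasks expressed in natural language. For example, given the prompt "Translate to English: Je t'aime", the model will most likely answer "I love you". Our paper provides some ideas for prompts.
Feel free to share your generations in the Community tab!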
On CPU:

    # pip install -q transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "bigscience/bloomz-p3"

    # Load the tokenizer and the model weights
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # Encode the prompt, generate, and decode the output
    inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt")
    outputs = model.generate(inputs)
    print(tokenizer.decode(outputs[0]))

On GPU:

    # pip install -q transformers accelerate
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "bigscience/bloomz-p3"

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # torch_dtype="auto" keeps the checkpoint's dtype; device_map="auto" places the weights on the available devices
    model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype="auto", device_map="auto")

    # Move the encoded prompt to the GPU before generating
    inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt").to("cuda")
    outputs = model.generate(inputs)
    print(tokenizer.decode(outputs[0]))

On GPU in 8-bit:

    # pip install -q transformers accelerate bitsandbytes
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "bigscience/bloomz-p3"

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # load_in_8bit=True quantizes the weights to 8-bit via bitsandbytes
    model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", load_in_8bit=True)

    inputs = tokenizer.encode("Translate to English: Je t’aime.", return_tensors="pt").to("cuda")
    outputs = model.generate(inputs)
    print(tokenizer.decode(outputs[0]))
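Because BLOOMZ checkpoints are loaded as causal language models, the sequence returned by `generate` starts with the prompt tokens and is followed by the continuation. The sketch below shows one way to keep only the newly generated part. The helper name `continuation_ids` is hypothetical and not part of the model card or of `transformers`; it works on plain lists of token IDs so it can be tried without downloading a checkpoint.

```python
from typing import List, Sequence


def continuation_ids(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return the token IDs that follow the first `prompt_length` prompt tokens.

    Hypothetical helper for illustration: it only slices the sequence and
    does not call the model or the tokenizer.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return list(output_ids[prompt_length:])


# Toy example with made-up token IDs: 3 prompt tokens, 2 generated tokens.
print(continuation_ids([11, 12, 13, 21, 22], prompt_length=3))
```

With the loading code shown earlier, the same idea would read roughly as `tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)`, which also drops special tokens such as the end-of-sequence marker from the printed text.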
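Prompt Engineering: Performance may vary depending on the prompt. For BLOOMZ models, we recommend making it very clear where the input stops, so that the model does not try to continue it. For example, the prompt "Translate to English: Je t'aime", without a full stop (.) at the end, may lead the model to continue the French sentence instead of translating it. Better prompts are "Translate to English: Je t'aime.", "Translate to English: Je t'aime. Translation: ", or "What is "Je t'aime." in English?", where it is clear to the model when it should answer. We also recommend giving the model as much context as possible. For example, if you want it to answer in Telugu, tell the model so, e.g. "Explain in a sentence in Telugu what is backpropagation in neural networks.".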
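The advice about marking the end of the input can be sketched as a small preprocessing step. This is a minimal illustration, not part of the released code: the function name `close_prompt`, the `suffix` parameter, and the set of punctuation marks treated as "terminal" are all assumptions made for this example.

```python
# Characters assumed here to mark a clear end of the input (an illustrative choice).
TERMINAL_PUNCTUATION = (".", "?", "!", ":")


def close_prompt(prompt: str, suffix: str = "") -> str:
    """Return the prompt with an unambiguous end.

    Appends a full stop when the prompt does not already end in terminal
    punctuation, then optionally appends a cue such as "Translation:".
    Hypothetical helper for illustration only.
    """
    text = prompt.strip()
    if not text.endswith(TERMINAL_PUNCTUATION):
        text += "."
    if suffix:
        text = f"{text} {suffix}"
    return text


print(close_prompt("Translate to English: Je t'aime"))
print(close_prompt("Translate to English: Je t'aime", suffix="Translation:"))
```

The first call yields the prompt with a trailing full stop; the second additionally appends the "Translation:" cue, matching two of the better prompts listed above.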
For zero-shot results on unseen tasks, we refer to Table 7 of our paper & bigscience/evaluation-results. The sidebar reports the zero-shot performance of the best prompt per dataset config.
    @misc{muennighoff2022crosslingual,
      title={Crosslingual Generalization through Multitask Finetuning},
      author={Niklas Muennighoff and Thomas Wang and Lintang Sutawika and Adam Roberts and Stella Biderman and Teven Le Scao and M Saiful Bari and Sheng Shen and Zheng-Xin Yong and Hailey Schoelkopf and Xiangru Tang and Dragomir Radev and Alham Fikri Aji and Khalid Almubarak and Samuel Albanie and Zaid Alyafeai and Albert Webson and Edward Raff and Colin Raffel},
      year={2022},
      eprint={2211.01786},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }