Model:
togethercomputer/GPT-NeoXT-Chat-Base-20B
Feel free to try out our OpenChatKit feedback app!
TLDR: As part of OpenChatKit (codebase available here), GPT-NeoXT-Chat-Base-20B-v0.16 is a 20B-parameter language model, fine-tuned from EleutherAI's GPT-NeoX with over 40 million instructions on 100% carbon-negative compute.
GPT-NeoXT-Chat-Base-20B-v0.16 is based on EleutherAI's GPT-NeoX model and is fine-tuned on data that focuses on conversational interactions. We focused the tuning on several tasks, such as question answering, classification, extraction, and summarization, and fine-tuned the model on a collection of 43 million high-quality instructions. Our partners LAION and Ontocord.ai helped curate the dataset the model is based on. You can read more about this process and the availability of the dataset in LAION's blog post here.
In addition to the fine-tuning above, GPT-NeoXT-Chat-Base-20B-v0.16 was further fine-tuned on a small amount of feedback data, which allows the model to better adapt to human preferences in conversation.
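Throughout the examples below, conversations are serialized as plain text in which human turns are prefixed with `<human>:` and model turns with `<bot>:`. As a purely illustrative sketch (the exact serialization of the released training data may differ), a single multi-round example could look like this:

```python
# Illustrative only: a hypothetical two-round example in the <human>/<bot>
# conversation format; the released training data may be serialized differently.
example = (
    "<human>: What is the capital of France?\n"
    "<bot>: The capital of France is Paris.\n"
    "<human>: Roughly how many people live there?\n"
    "<bot>: Paris has a population of roughly two million people."
)
```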
Running the model in float16 requires a GPU with 48GB of memory:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/GPT-NeoXT-Chat-Base-20B", torch_dtype=torch.float16
)
model = model.to('cuda:0')

# infer
inputs = tokenizer("<human>: Hello!\n<bot>:", return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=True, temperature=0.8)
output_str = tokenizer.decode(outputs[0])
print(output_str)
```
Loading the model in 8-bit instead requires a GPU with 24GB of memory:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# init (8-bit loading requires the bitsandbytes and accelerate packages)
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/GPT-NeoXT-Chat-Base-20B", device_map="auto", load_in_8bit=True
)

# infer
inputs = tokenizer("<human>: Hello!\n<bot>:", return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=True, temperature=0.8)
output_str = tokenizer.decode(outputs[0])
print(output_str)
```
The model can also run on CPU in bfloat16. Note that the 20B parameters take roughly 40GB of RAM at 2 bytes each, and generation will be much slower than on a GPU:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# init
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/GPT-NeoXT-Chat-Base-20B", torch_dtype=torch.bfloat16
)

# infer
inputs = tokenizer("<human>: Hello!\n<bot>:", return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=True, temperature=0.8)
output_str = tokenizer.decode(outputs[0])
print(output_str)
```
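With `max_new_tokens` raised, the model will often continue past its own answer and begin a new `<human>:` turn. A minimal sketch of cutting generation off at that point with a custom stopping criterion (this helper is our illustration, not part of the official examples; it assumes `tokenizer` and `model` are loaded as in any of the snippets above):

```python
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnHumanTurn(StoppingCriteria):
    """Stop once the newly generated text starts a new <human> turn."""

    def __init__(self, tokenizer, prompt_len, stop_str="<human>:"):
        self.tokenizer = tokenizer
        self.prompt_len = prompt_len  # number of prompt tokens to skip
        self.stop_str = stop_str

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the generated continuation, not the prompt itself.
        new_text = self.tokenizer.decode(input_ids[0][self.prompt_len:])
        return self.stop_str in new_text

inputs = tokenizer("<human>: Hello!\n<bot>:", return_tensors='pt').to(model.device)
criteria = StoppingCriteriaList(
    [StopOnHumanTurn(tokenizer, prompt_len=inputs["input_ids"].shape[1])]
)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True,
                         temperature=0.8, stopping_criteria=criteria)
print(tokenizer.decode(outputs[0]))
```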
OpenChatKit excels at several tasks, including:
**Summarize a long document into a single sentence and conduct question answering related to the document, with multiple rounds**

```
<human>: Last year, the travel industry saw a big rebound in demand — and that demand is showing no signs of slowing down this spring break travel season. Planes and hotels will be full, travelers will likely face long queues, cancellations, massive crowds and plenty of other travel nightmares. But perhaps the most frustrating thing you’ll have to prepare for this spring break is if your luggage goes missing. You’ve got to somehow track down your belongings that were in the airline’s care and then wait to be reunited — all while trying to enjoy the start of your long-awaited vacation. As far as travel nightmares go, lost luggage is up there as one of the worst.

To be honest up front, I am great at strategically losing things — or rather, losing stuff with style. (I hope you appreciate the “Toy Story” reference.) So when a product comes along that’s a perfect way to keep track of my belongings — especially when traveling — I’m adding it to my travel bag.

I’ve been a big fan of Apple’s $29 item tracker, dubbed AirTag. The long-rumored AirTag product finally launched in 2021, and while it’s not the first item tracker to exist, it has two key features that set it apart from the rest to make it the ultimate travel companion: access to the Find My network and Precision Finding technology.

The Federal Aviation Administration has even ruled that having an AirTag in your luggage is permitted and, in fact, not a safety hazard. The devices only contain 0.1 grams of lithium, which falls below the FAA’s limit.

“Luggage tracking devices powered by lithium metal cells that have 0.3 grams or less of lithium can be used on checked baggage,” the FAA said in a statement. “Apple AirTags meet this threshold; other luggage tracking devices may not.”

But, you may be wondering what makes them so special and why you should carry one in your checked luggage this spring break? Let’s explain why.

Summarize the above into a single paragraph.
<bot>: If you're traveling this spring break, you might want to consider investing in an item tracker to help you find your lost luggage. Apple's AirTag is the best option, according to a new report.
<human>: But isn't it unsafe to put AirTag in a luggage?
<bot>: The Federal Aviation Administration has ruled that having an AirTag in your luggage is permitted and, in fact, not a safety hazard.
<human>: Ah, why is that?
<bot>: The devices only contain 0.1 grams of lithium, which falls below the FAA’s limit.
```
**Extract structured information from unstructured documents, as illustrated in the following example.**

```
<human>: Generate a table summarizing the options outlined in this email. Team, we have to make a decision on the product plan for Goocrux v2. The first option will take us 3 weeks of development, and will achieve performance of 80ms, but means that the UI has to change. The second option will take us 4 weeks of development, and will achieve performance of 60ms, without any UI change. The third option will take the longest -- 6 weeks of development, will achieve performance of 30ms, without any UI change. A markdown table with four columns: (1) Option ID, (2) Development time, (3) Performance, (4) UI Change:
<bot>: #.
| ID | Option ID | Development time | Performance | UI Change |
| --------- | --------- | --------- | --------- | --------- |
| 1 | 1 | 3 weeks | 80ms | Yes |
| 2 | 2 | 4 weeks | 60ms | No |
| 3 | 3 | 6 weeks | 30ms | No |
```
**Classifying a sentence or paragraph into different categories.**

```
<human>: Classify the sentiment of the following sentence into Positive, Neutral, or Negative: Super excited about teaching Stanford’s first course on Large Language Models! Check the syllabus out here
<bot>: Positive
<human>: Classify the sentiment of the following sentence into Positive, Neutral, or Negative: How about the following sentence: It is raining outside and I feel so blue
<bot>: Negative
```
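Multi-round exchanges like the summarization example above are driven entirely by the prompt: each user message is appended to a running transcript, the model's reply is appended back, and the full transcript is re-sent on the next turn. Here is a minimal sketch of such a loop, assuming `tokenizer` and `model` are loaded as in the snippets above (the `chat` helper is ours, purely for illustration):

```python
# Minimal multi-round chat loop. The transcript is the model's only state,
# so the full <human>/<bot> history is re-sent on every turn.
history = ""

def chat(user_message, max_new_tokens=128):
    global history
    history += f"<human>: {user_message}\n<bot>:"
    inputs = tokenizer(history, return_tensors='pt').to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             do_sample=True, temperature=0.8)
    # Keep only the newly generated tokens, and trim at the next turn
    # marker in case the model starts speaking for the human.
    reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])
    reply = reply.split("<human>:")[0].strip()
    history += f" {reply}\n"
    return reply

print(chat("Hello! Can you summarize documents?"))
print(chat("What about question answering?"))
```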
In addition, the model performs well with few-shot prompts. For both classification and extraction, it performs even better in the few-shot setting, as on most HELM tasks. If you are interested in trying few-shot prompting with this model, please get in touch via Contact us.
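Few-shot prompting here simply means packing a handful of solved examples into the prompt, in the same `<human>`/`<bot>` format, before the actual query. A hedged sketch for the sentiment task above (the exact few-shot format used in our evaluations may differ):

```python
# Hypothetical few-shot sentiment prompt: two solved examples precede the
# real query, all in the model's <human>/<bot> conversation format.
few_shot_prompt = (
    "<human>: Classify the sentiment of the following sentence into "
    "Positive, Neutral, or Negative: I love my new phone!\n"
    "<bot>: Positive\n"
    "<human>: Classify the sentiment of the following sentence into "
    "Positive, Neutral, or Negative: The package arrived on Tuesday.\n"
    "<bot>: Neutral\n"
    "<human>: Classify the sentiment of the following sentence into "
    "Positive, Neutral, or Negative: It is raining outside and I feel so blue\n"
    "<bot>:"
)
inputs = tokenizer(few_shot_prompt, return_tensors='pt').to(model.device)
# Greedy decoding is enough for a short, constrained label.
outputs = model.generate(**inputs, max_new_tokens=4, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```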
That said, there is still work to do, and we need your help!
We are excited to address the remaining issues with the help of your feedback, additional datasets, and improvements in accuracy.
The model is intended for research purposes. Uses that fall outside this scope are described below.
The OpenChatKit community provides GPT-NeoXT-Chat-Base-20B-v0.16 as an open-source tool for building chatbots. The community is not responsible for any misuse, malicious use, or out-of-scope use of the model. It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.
Out-of-Scope Use
GPT-NeoXT-Chat-Base-20B-v0.16 is designed for use in chatbot applications and may not perform well in other use cases outside its intended scope. For example, it may not be suitable for safety-critical applications or for making decisions that have a significant impact on individuals or society. It is important to consider the model's limitations and to use it only for its intended purpose.
Misuse and Malicious Use
GPT-NeoXT-Chat-Base-20B-v0.16 is designed for use in chatbot applications and should not be used for any other purpose. Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the OpenChatKit community project.
Using the model to generate content that is cruel to individuals is a misuse of this model.
Limitations
GPT-NeoXT-Chat-Base-20B-v0.16, like other language-model-based chatbots, has limitations that should be taken into consideration. For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside its training data. We therefore welcome contributions from individuals and organizations and encourage collaboration toward creating a more robust and inclusive chatbot.
Training Data
Please refer to togethercomputer/OpenDataHub.
Training Procedure
Join us on the Together Discord.