模型:
TehVenom/MPT-7b-Chat-Instruct-LongCTX-Merge
这是一个合并模型,使用加权参数混合策略,比例为(20:20:60),合并了以下模型:
最终模型由以下部分组成:
(MTP-7b Storywriter [20%] + MTP-7b Instruct [20%]) + MTP-7b Chat [60%]
这样做是为了测试“长上下文”调整如何影响注意力,当它与已经用于不同目的且在较短上下文范围上训练的模型合并在一起时。与之前的合并模型 (That sports a 50/50 ratio) 相比,这个模型更倾向于Chat基础模型,以便对比CTMX范围合并的效果,并且拥有一个主要专注于聊天的模型。
这个合并的目标有两个,第一个目标是看看65k-Storywriter模型中有多少是必要的,以提高最终模型的上下文大小范围,同时通过将文学/Instruct类型的模型添加到聊天模型中,使其更有趣/富有表现力和智能。
由于MPT-7b Storywriter的影响,该模型可能生成被认为NSFW的内容,因为MPT-7b Storywriter引用了各种各样的书籍。
具体提示内容未知,但可以尝试将其视为聊天机器人场景/提示。
可以尝试以两行提示开始,例如:
Bot: "Hello, how are you?" You: "I am doing just fine, thank you."
或者可以试着给它指令,例如:
Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: Explain the concept of artificial intelligence in simple terms. ### Response: Artificial Intelligence (AI) is the ability of machines and computers to make decisions and complete tasks similar to what humans can do. AI can learn from data, recognize patterns, and make predictions. AI technology can automate tedious and complex processes, helping to reduce human effort and errors, and enable more accurate and efficient decisions. AI can be used in a wide range of applications, from robotics and autonomous vehicles to healthcare and finance. It is increasingly becoming an integral part of everyday life.
查看聊天-7b模型中的数据集,以更好地掌握如何正确提问:
-(Anthropic/hh-rlhf)[ https://huggingface.co/datasets/Anthropic/hh-rlhf] -(tatsu-lab/alpaca)[ https://huggingface.co/datasets/tatsu-lab/alpaca] -(Hello-SimpleAI/HC3)[ https://huggingface.co/datasets/Hello-SimpleAI/HC3]
阅读原始模型卡片,了解如何在该模型上进行推理。