Hitokomoru Diffusion V2

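This is a latent diffusion model trained to generate artwork in the style of the Japanese artist ヒトこもる/Hitokomoru. The current model was fine-tuned from waifu-diffusion-1-4 (wd-1-4-anime_e2.ckpt) with a learning rate of 2.0e-6, 15,000 training steps, and a batch size of 4, on 257 artworks collected from Danbooru. It continues the fine-tuning that hitokomoru-diffusion left off; that earlier model was fine-tuned from Anything V3.0. The dataset was preprocessed with the Aspect Ratio Bucketing Tool so that it could be converted to latents and trained at non-square resolutions. Like other anime-style Stable Diffusion models, it also supports Danbooru tags for generating images.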

For example: 1girl, white hair, golden eyes, beautiful eyes, detail, flower field, cumulus clouds, lightning, detailed sky, garden
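The aspect ratio bucketing mentioned above can be illustrated with a minimal sketch. This is not the code of the Aspect Ratio Bucketing Tool itself; the area budget, grid step, and side limits below are assumed values chosen only for illustration. The idea is to enumerate a set of non-square resolutions with a similar pixel budget and assign each image to the bucket whose aspect ratio is closest.

```python
# Illustrative sketch of aspect-ratio bucketing; not the actual tool's code.
# max_area, step, min_side and max_side are assumed values.

def make_buckets(max_area=512 * 768, step=64, min_side=256, max_side=1024):
    """Enumerate (width, height) pairs on a `step`-pixel grid within the area budget."""
    buckets = set()
    for width in range(min_side, max_side + 1, step):
        # Largest grid-aligned height that keeps width * height <= max_area,
        # clamped to max_side.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # also keep the transposed resolution
    return sorted(buckets)


def assign_bucket(image_size, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    width, height = image_size
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


if __name__ == "__main__":
    buckets = make_buckets()
    print(assign_bucket((1000, 1500), buckets))  # a portrait image
    print(assign_bucket((1500, 1000), buckets))  # a landscape image
```

In a real pipeline each image would then be resized and cropped to its bucket resolution, and batches would be drawn from a single bucket at a time so that all tensors in a batch share one shape.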

Use it with Automatic1111's Stable Diffusion Webui (see How to Use)
Use it with Diffusers

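Model Details

Developed by: Linaqruf
Model type: Diffusion-based text-to-image generation model
Model description: This is a model that can be used to generate and modify images based on text prompts.
License: CreativeML Open RAIL++-M License
Finetuned from model: waifu-diffusion-v1-4-epoch-2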

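How to Use

Download hitokomoru-v2.ckpt here, or download the safetensors version here.
This model is fine-tuned from waifu-diffusion-v1-4-epoch-2, which is in turn fine-tuned from stable-diffusion-2-1-base. To run it in Automatic1111's Stable Diffusion Webui, you therefore need to put the inference config .YAML file next to the model; you can find it here.
You need to adjust your prompt using aesthetic tags. Based on the Official Waifu Diffusion 1.4 release notes, an ideal negative prompt to guide the model toward high-aesthetic generations would look like this: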

worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry

In addition, for high-aesthetic results, the following should be prepended to the prompt:

masterpiece, best quality, high quality, absurdres
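Returning to the inference config mentioned earlier in this section: a minimal sketch of placing it next to the checkpoint is shown below. It assumes the Webui pairs a config with a checkpoint by matching base names (for example hitokomoru-v2.ckpt with hitokomoru-v2.yaml); the function name and the paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def place_inference_config(checkpoint_path, config_path):
    """Copy the inference config next to the checkpoint, reusing its base name.

    Assumption: the Webui looks for '<checkpoint base name>.yaml' in the same
    directory as the checkpoint.
    """
    checkpoint = Path(checkpoint_path)
    target = checkpoint.with_suffix(".yaml")  # e.g. hitokomoru-v2.ckpt -> hitokomoru-v2.yaml
    shutil.copyfile(config_path, target)
    return target
```

A call might look like `place_inference_config("models/Stable-diffusion/hitokomoru-v2.ckpt", "downloads/inference-config.yaml")`, where both paths are placeholders for wherever the files live on your machine.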

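Diffusers

This model can be used just like any other Stable Diffusion model. For more information, please have a look at the Stable Diffusion documentation. You can also export the model to ONNX, MPS and/or FLAX/JAX.

Install the following dependencies in order to run the pipeline: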

pip install diffusers transformers accelerate scipy safetensors

Running the pipeline (if you don't swap the scheduler, it will run with the default DDIM; in this example we swap it for DPMSolverMultistepScheduler):

import torch
from torch import autocast
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "Linaqruf/hitokomoru-diffusion-v2"

# Load the pipeline in half precision
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# Swap the default scheduler for DPMSolverMultistepScheduler (DPM-Solver++)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "masterpiece, best quality, high quality, 1girl, solo, sitting, confident expression, long blonde hair, blue eyes, formal dress"
negative_prompt = "worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry"

with autocast("cuda"):
    image = pipe(prompt, 
                 negative_prompt=negative_prompt, 
                 width=512,
                 height=728,
                 guidance_scale=12,
                 num_inference_steps=50).images[0]
    
image.save("anime_girl.png")
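The recommended quality tags and negative prompt from the usage section can be kept in one place and prepended automatically. The helper below is a small sketch of that idea; `build_prompt` and the constant names are hypothetical and not part of diffusers.

```python
# Tag strings copied from the usage section of this model card.
QUALITY_TAGS = "masterpiece, best quality, high quality, absurdres"
NEGATIVE_PROMPT = (
    "worst quality, low quality, medium quality, deleted, lowres, comic, "
    "bad anatomy, bad hands, text, error, missing fingers, extra digit, "
    "fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry"
)


def build_prompt(subject_tags, quality_tags=QUALITY_TAGS):
    """Prepend the quality tags to a comma-separated list of Danbooru tags."""
    # Normalize spacing and drop empty entries such as a trailing comma.
    tags = [tag.strip() for tag in subject_tags.split(",") if tag.strip()]
    return ", ".join([quality_tags] + tags)


if __name__ == "__main__":
    print(build_prompt("1girl, solo, sitting, long blonde hair, blue eyes"))
```

The returned string can be passed as `prompt` and `NEGATIVE_PROMPT` as `negative_prompt` in the pipeline call shown above.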

Examples

Here are some cherry-picked samples:

Prompt and settings for the example images

masterpiece, best quality, high quality, 1girl, solo, sitting, confident expression, long blonde hair, blue eyes, formal dress, jewelry, make-up, luxury, close-up, face, upper body.

Negative prompt: worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry

Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 994051800, Size: 512x768, Model hash: ea61e913a0, Model: hitokomoru-v2, Batch size: 2, Batch pos: 0, Denoising strength: 0.6, Clip skip: 2, ENSD: 31337, Hires upscale: 1.5, Hires steps: 20, Hires upscaler: Latent (nearest-exact)
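The settings line above is a flat list of `Key: value` pairs, so it can be read programmatically. The sketch below parses it into a dictionary and estimates the final image size, assuming the hires pass simply multiplies the base size by the "Hires upscale" factor; both function names are hypothetical.

```python
def parse_settings(line):
    """Split a 'Key: value, Key: value' settings line into a dict of strings."""
    settings = {}
    for field in line.split(", "):
        key, _, value = field.partition(": ")
        settings[key] = value
    return settings


def hires_size(settings):
    """Estimate the output size, assuming base size times the 'Hires upscale' factor."""
    width, height = (int(v) for v in settings["Size"].split("x"))
    scale = float(settings["Hires upscale"])
    return int(width * scale), int(height * scale)


if __name__ == "__main__":
    line = (
        "Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 994051800, "
        "Size: 512x768, Hires upscale: 1.5, Hires steps: 20, "
        "Hires upscaler: Latent (nearest-exact)"
    )
    settings = parse_settings(line)
    print(settings["Sampler"], hires_size(settings))
```

When reproducing a sample with diffusers, "Steps" corresponds to `num_inference_steps` and "CFG scale" to `guidance_scale` in the pipeline call.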

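License

This model is open access, with a CreativeML OpenRAIL-M license further specifying rights and usage. The CreativeML OpenRAIL License specifies:

You can't use the model to deliberately produce or share illegal or harmful outputs or content

The authors claim no rights on the outputs you generate; you are free to use them and are accountable for their use, which must not go against the provisions set in the license

You may redistribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully). Please read the full license here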

Credit

ヒトこもる/Hitokomoru for the dataset

Author:

Linaqruf

Dataset size:

15.56 GB