Model:

lllyasviel/control_v11p_sd15_lineart

Controlnet - v1.1 - lineart Version

Controlnet v1.1 was released in lllyasviel/ControlNet-v1-1 by Lvmin Zhang.

This checkpoint is a conversion of the original checkpoint into diffusers format. It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.

For more details, see the Diffusers docs.

ControlNet is a neural network structure that controls diffusion models by adding extra conditions.

This checkpoint corresponds to the ControlNet conditioned on lineart images.

Model Details

Introduction

Controlnet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang and Maneesh Agrawala.

The abstract reads as follows:

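We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k). Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on personal devices. Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data. We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc. This may enrich the methods to control large diffusion models and further facilitate related applications.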
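To make the idea of "adding extra conditions" to a pretrained model concrete, here is a toy scalar sketch. It is not the actual ControlNet architecture. It assumes, following the ControlNet paper, that the trainable conditioning branch is attached through a zero-initialized connection, so that an untrained ControlNet leaves the base model's output unchanged. All function names and constants are hypothetical stand-ins.

```python
def frozen_block(x: float) -> float:
    """Stand-in for a pretrained layer of the base model (weights frozen)."""
    return 2.0 * x + 1.0


def control_branch(x: float, condition: float) -> float:
    """Stand-in for the trainable branch that also sees the extra condition."""
    return 0.5 * (x + condition)


def controlled_block(x: float, condition: float, gate: float = 0.0) -> float:
    """Add the conditioning branch to the frozen output, scaled by `gate`.

    `gate` plays the role of a zero-initialized weight: at 0.0 the result
    equals the frozen block's output; training would move it away from zero.
    """
    return frozen_block(x) + gate * control_branch(x, condition)


if __name__ == "__main__":
    print(controlled_block(3.0, 10.0))            # gate at its initial value
    print(controlled_block(3.0, 10.0, gate=1.0))  # gate after hypothetical training
```

With the gate at zero the condition has no effect, which is why adding the branch does not disturb the pretrained model at the start of training.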

Example

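It is recommended to use this checkpoint with Stable Diffusion v1-5, since it was trained on that model. Experimentally, the checkpoint also works with other diffusion models, such as dreamboothed Stable Diffusion.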

Note: If you want to process an image to create the auxiliary conditioning, the following external dependencies are required:

  • Install https://github.com/patrickvonplaten/controlnet_aux:
    $ pip install controlnet_aux==0.3.0

  • Install diffusers and related packages:
    $ pip install diffusers transformers accelerate

  • Run the code:
    from pathlib import Path

    import torch
    from controlnet_aux import LineartDetector
    from diffusers import (
        ControlNetModel,
        StableDiffusionControlNetPipeline,
        UniPCMultistepScheduler,
    )
    from diffusers.utils import load_image

    checkpoint = "ControlNet-1-1-preview/control_v11p_sd15_lineart"

    # Load the input image and resize it to 512x512.
    image = load_image(
        "https://huggingface.co/ControlNet-1-1-preview/control_v11p_sd15_lineart/resolve/main/images/input.png"
    )
    image = image.resize((512, 512))

    prompt = "michael jackson concert"

    # Extract the line art that serves as the control image.
    processor = LineartDetector.from_pretrained("lllyasviel/Annotators")
    control_image = processor(image)

    # Create the output directory first; save() fails if it does not exist.
    Path("images").mkdir(exist_ok=True)
    control_image.save("images/control.png")

    # Load the ControlNet and the base pipeline in half precision.
    controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    )

    # Swap in the UniPC scheduler and offload idle submodules to the CPU to save GPU memory.
    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
    pipe.enable_model_cpu_offload()

    # Fix the seed so the output is reproducible.
    generator = torch.manual_seed(0)
    image = pipe(prompt, num_inference_steps=30, generator=generator, image=control_image).images[0]

    image.save("images/image_out.png")
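The example above resizes the input to a fixed 512x512, which distorts images that are not square. As a hedged alternative, the sketch below computes a target size that keeps the aspect ratio and rounds both sides down to a multiple of 8. The multiple of 8 is an assumption based on Stable Diffusion v1-5 working on latents at 1/8 of the pixel resolution. The helper names `snap_to_multiple` and `fit_size` are hypothetical and not part of diffusers or controlnet_aux.

```python
from typing import Tuple


def snap_to_multiple(value: int, multiple: int = 8) -> int:
    """Round `value` down to a multiple of `multiple`, but never below `multiple`."""
    return max(multiple, (value // multiple) * multiple)


def fit_size(
    width: int,
    height: int,
    target_short_side: int = 512,
    multiple: int = 8,
) -> Tuple[int, int]:
    """Scale (width, height) so the shorter side is about `target_short_side`.

    The aspect ratio is kept approximately, and both returned sides are
    multiples of `multiple`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target_short_side / min(width, height)
    new_width = snap_to_multiple(round(width * scale), multiple)
    new_height = snap_to_multiple(round(height * scale), multiple)
    return new_width, new_height


if __name__ == "__main__":
    print(fit_size(1024, 768))
```

With a PIL image, whose `size` attribute is `(width, height)`, this could be used as `image = image.resize(fit_size(*image.size))` in place of the fixed `(512, 512)`.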
    

Other released checkpoints v1-1

The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning:

Control Image Overview | Condition Image
---------------------- | ---------------
Trained with canny edge detection | A monochrome image with white edges on a black background.
Trained with pixel to pixel instruction | No condition.
Trained with image inpainting | No condition.
Trained with multi-level line segment detection | An image with annotated line segments.
Trained with depth estimation | An image with depth information, usually represented as a grayscale image.
Trained with surface normal estimation | An image with surface normal information, usually represented as a color-coded image.
Trained with image segmentation | An image with segmented regions, usually represented as a color-coded image.
Trained with line art generation | An image with line art, usually black lines on a white background.
Trained with anime line art generation | An image with anime-style line art.
Trained with human pose estimation | An image with human poses, usually represented as a set of keypoints or skeletons.
Trained with scribble-based image generation | An image with scribbles, usually random or user-drawn strokes.
Trained with soft edge image generation | An image with soft edges, usually to create a more painterly or artistic effect.
Trained with image shuffling | An image with shuffled patches or regions.
Trained with image tiling | A blurry image or part of an image.

More information

For more information, see the Diffusers ControlNet Blog Post and the official docs.