Model:

lllyasviel/control_v11p_sd15_openpose

Task:

Image-to-Image

Library:

Diffusers

Other:

art controlnet stable-diffusion controlnet-v1-1

Preprint:

arxiv:2302.05543

License:

openrail


Controlnet - v1.1 - openpose Version

Controlnet v1.1 is the successor model of Controlnet v1.0 and was released in lllyasviel/ControlNet-v1-1 by Lvmin Zhang.

This checkpoint is a conversion of the original checkpoint into diffusers format. It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.

For more details, see the Diffusers docs.

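ControlNet is a neural network structure that controls diffusion models by adding extra conditions.

This checkpoint corresponds to the ControlNet conditioned on openpose images.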
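To make "adding extra conditions" more concrete, here is a toy sketch of the zero-initialized connection described in the ControlNet paper: a frozen block is left untouched, and a trainable copy receives the condition through gates whose weights start at zero. All names here (`frozen_block`, `trainable_copy`, `zero_gate`, the scalar weights) are hypothetical stand-ins operating on plain Python lists; they are not the layers of this checkpoint.

```python
def frozen_block(x):
    # Stand-in for a locked, pretrained layer of the diffusion model.
    return [2.0 * v for v in x]


def trainable_copy(x):
    # Stand-in for the trainable copy; it starts out identical to the frozen block.
    return [2.0 * v for v in x]


def zero_gate(x, weight):
    # Stand-in for a zero-initialized layer: with weight == 0.0 it outputs zeros.
    return [weight * v for v in x]


def controlled_block(x, condition, w_in=0.0, w_out=0.0):
    # The condition enters the trainable copy through the first gate.
    injected = [a + b for a, b in zip(x, zero_gate(condition, w_in))]
    # The copy's output is added back to the frozen output through the second gate.
    residual = zero_gate(trainable_copy(injected), w_out)
    return [a + b for a, b in zip(frozen_block(x), residual)]
```

With both gate weights at their initial value of zero, `controlled_block` returns exactly what `frozen_block` returns, so in this toy the condition only starts to influence the output once the gate weights move away from zero.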

Model Details

Developed by: Lvmin Zhang, Maneesh Agrawala
Model type: Diffusion-based text-to-image generation model
Language(s): English
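License: The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license, on which our license is based.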
Resources for more information: GitHub Repository, Paper
Cite as:

@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

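Introduction

Controlnet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang and Maneesh Agrawala.

The abstract reads as follows:

We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k). Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal device. Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data. We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc. This may enrich the methods to control large diffusion models and further facilitate related applications.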

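Example

It is recommended to use this checkpoint with Stable Diffusion v1-5, since the checkpoint was trained on it. Experimentally, the checkpoint can also be used with other diffusion models, such as a dreamboothed Stable Diffusion.

Note: If you want to process an image to create the auxiliary conditioning, install the external dependencies as shown below:

Install https://github.com/patrickvonplaten/controlnet_aux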

$ pip install controlnet_aux==0.3.0

Install diffusers and the related packages:

$ pip install diffusers transformers accelerate
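Before running the example, it can help to confirm that the packages installed above are importable. The following is a minimal sketch using only the standard library; the helper names `missing_packages` and `installed_version` and the `REQUIRED` list are illustrative assumptions, with `torch` included because the example code imports it.

```python
import importlib.util
from importlib import metadata

# Top-level module names the example code imports (assumed list).
REQUIRED = ["torch", "diffusers", "transformers", "accelerate", "controlnet_aux"]


def missing_packages(names):
    # find_spec returns None when a top-level module cannot be located.
    return [name for name in names if importlib.util.find_spec(name) is None]


def installed_version(dist_name):
    # Return the installed distribution version, or None if it is not installed.
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("controlnet_aux version:", installed_version("controlnet_aux"))
```

Since the install command above pins `controlnet_aux==0.3.0`, comparing the reported version against that pin is a quick sanity check.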

Run the code:

import os

import torch
from controlnet_aux import OpenposeDetector
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)
from diffusers.utils import load_image

checkpoint = "lllyasviel/control_v11p_sd15_openpose"
prompt = "chef in the kitchen"

# Create the output directory up front so the save() calls below do not fail.
os.makedirs("images", exist_ok=True)

# Download the example input image.
image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_openpose/resolve/main/images/input.png"
)

# Extract the pose (body, hands, and face) to use as the control image.
processor = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
control_image = processor(image, hand_and_face=True)
control_image.save("images/control.png")

# Load the ControlNet checkpoint and attach it to Stable Diffusion v1-5.
controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)

# Switch to the UniPC scheduler and offload idle submodules to the CPU to save GPU memory.
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

# Fix the seed so the result is reproducible.
generator = torch.manual_seed(0)
image = pipe(prompt, num_inference_steps=30, generator=generator, image=control_image).images[0]

image.save("images/image_out.png")
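Stable Diffusion v1-5 works in a latent space downsampled by a factor of 8, so image sides are normally multiples of 8. If you supply your own input image, a small helper like the one below can pick a suitable size before pose extraction. This is only a sketch: the name `snap_size` and the default `max_side=512` are assumptions for illustration, not part of diffusers or controlnet_aux.

```python
def snap_size(width, height, max_side=512, multiple=8):
    """Shrink (width, height) so the longer side is at most max_side,
    then round each side down to a multiple of `multiple`."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic keeps the aspect ratio without float rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    new_width = max(multiple, width // multiple * multiple)
    new_height = max(multiple, height // multiple * multiple)
    return new_width, new_height
```

With a PIL image, this could be applied as `image = image.resize(snap_size(*image.size))` before calling the processor.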

Other released checkpoints v1-1

The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning:


Training Condition	Control Image Overview
Trained with canny edge detection	A monochrome image with white edges on a black background.
Trained with pixel to pixel instruction	No condition.
Trained with image inpainting	No condition.
Trained with multi-level line segment detection	An image with annotated line segments.
Trained with depth estimation	An image with depth information, usually represented as a grayscale image.
Trained with surface normal estimation	An image with surface normal information, usually represented as a color-coded image.
Trained with image segmentation	An image with segmented regions, usually represented as a color-coded image.
Trained with line art generation	An image with line art, usually black lines on a white background.
Trained with anime line art generation	An image with anime-style line art.
Trained with human pose estimation	An image with human poses, usually represented as a set of keypoints or skeletons.
Trained with scribble-based image generation	An image with scribbles, usually random or user-drawn strokes.
Trained with soft edge image generation	An image with soft edges, usually to create a more painterly or artistic effect.
Trained with image shuffling	An image with shuffled patches or regions.

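Improvements in Openpose 1.1:

The improvement of this model comes mainly from our improved implementation of OpenPose. We carefully reviewed the differences between the pytorch OpenPose and CMU's c++ openpose. The processor should now be more accurate, especially for hands. The improved processor leads to the improvement of Openpose 1.1.
More inputs are supported (hand and face).
The training dataset of the previous cnet 1.0 had several problems, including: (1) a small group of grayscale human images was duplicated thousands of times (!!), which made the previous model somewhat prone to generating grayscale human images; (2) some images were of low quality, very blurry, or had significant JPEG artifacts; (3) a small group of images had wrongly paired prompts caused by a mistake in our data processing scripts. The new model fixes all of these problems in the training dataset and should be more reasonable in many cases.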
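The first dataset problem described above (a few grayscale images duplicated many times) is the kind of issue a simple audit can surface. The sketch below is not the authors' data processing script; `find_duplicates` and `is_grayscale` are hypothetical helpers, and they only catch byte-identical files and pixels whose RGB channels match within a tolerance.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """files: iterable of (name, raw_bytes) pairs.
    Returns lists of names whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for name, data in files:
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [names for names in groups.values() if len(names) > 1]


def is_grayscale(pixels, tolerance=0):
    """pixels: iterable of (r, g, b) tuples.
    True when every pixel's channels differ by at most `tolerance`."""
    return all(max(p) - min(p) <= tolerance for p in pixels)
```

Re-encoded or resized copies would not be caught by an exact hash; finding those would need a perceptual comparison, which is outside this sketch.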
