Model:
lllyasviel/control_v11p_sd15_scribble
ControlNet v1.1 is the successor model of ControlNet v1.0, released in lllyasviel/ControlNet-v1-1 by Lvmin Zhang.
This checkpoint is a conversion of the original checkpoint into diffusers format. It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.
For more details, please also check out the 🧨 Diffusers docs.
ControlNet is a neural network structure to control diffusion models by adding extra conditions.
This checkpoint corresponds to the ControlNet conditioned on scribble images.
Developed by: Lvmin Zhang, Maneesh Agrawala
Model type: Diffusion-based text-to-image generation model
Language(s): English
License: The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license on which our license is based.
Resources for more information: GitHub Repository, Paper.
Cite as:
```
@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
ControlNet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang and Maneesh Agrawala.
The abstract reads as follows:
We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k). Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal device. Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data. We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc. This may enrich the methods to control large diffusion models and further facilitate related applications.
It is recommended to use the checkpoint with Stable Diffusion v1-5, as the checkpoint has been trained on it. Experimentally, the checkpoint can also be used with other diffusion models, such as dreamboothed stable diffusion; a minimal sketch of that substitution follows.
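Swapping the base model is a one-line change. Below is a minimal sketch, where `./my-dreamboothed-sd15` is a hypothetical placeholder for any Stable Diffusion v1-5-derived checkpoint, not a real repository:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_scribble", torch_dtype=torch.float16
)
# Hypothetical placeholder: any SD v1-5-derived (e.g. dreamboothed) checkpoint
# can be substituted here while keeping the same ControlNet weights.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "./my-dreamboothed-sd15", controlnet=controlnet, torch_dtype=torch.float16
)
```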
Note: if you want to process an image to create the auxiliary conditioning, the following external dependencies are required:
$ pip install controlnet_aux==0.3.0
$ pip install diffusers transformers accelerate
```python
# Load the input image, extract a scribble map with HED, then condition
# Stable Diffusion v1-5 on it through the scribble ControlNet.
import os

import torch
from controlnet_aux import HEDdetector
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)
from diffusers.utils import load_image

checkpoint = "lllyasviel/control_v11p_sd15_scribble"

image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_scribble/resolve/main/images/input.png"
)
prompt = "royal chamber with fancy bed"

# HED edge detector; scribble=True post-processes the edges into scribble-like strokes.
processor = HEDdetector.from_pretrained("lllyasviel/Annotators")
control_image = processor(image, scribble=True)

os.makedirs("images", exist_ok=True)  # make sure the output directory exists
control_image.save("./images/control.png")

controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)

# Faster sampler and CPU offloading to reduce GPU memory usage.
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

generator = torch.manual_seed(0)
image = pipe(
    prompt, num_inference_steps=30, generator=generator, image=control_image
).images[0]
image.save("./images/image_out.png")
```
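The example above uses HEDdetector as the scribble preprocessor. To our understanding, controlnet_aux also ships a PidiNetDetector that accepts the same `scribble=True` flag and can be swapped in as an alternative preprocessor; the sketch below assumes that, and the output filename is our own choice:

```python
from controlnet_aux import PidiNetDetector
from diffusers.utils import load_image

image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_scribble/resolve/main/images/input.png"
)

# PidiNet edge detector; scribble=True again converts the edges into scribble strokes.
processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
control_image = processor(image, scribble=True)
control_image.save("./images/control_pidi.png")
```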
The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning (a loading sketch follows the table):
Model Name | Control Image Overview | Condition Image |
---|---|---|
lllyasviel/control_v11p_sd15_canny | Trained with canny edge detection | A monochrome image with white edges on a black background. |
lllyasviel/control_v11e_sd15_ip2p | Trained with pixel to pixel instruction | No condition. |
lllyasviel/control_v11p_sd15_inpaint | Trained with image inpainting | No condition. |
lllyasviel/control_v11p_sd15_mlsd | Trained with multi-level line segment detection | An image with annotated line segments. |
lllyasviel/control_v11f1p_sd15_depth | Trained with depth estimation | An image with depth information, usually represented as a grayscale image. |
lllyasviel/control_v11p_sd15_normalbae | Trained with surface normal estimation | An image with surface normal information, usually represented as a color-coded image. |
lllyasviel/control_v11p_sd15_seg | Trained with image segmentation | An image with segmented regions, usually represented as a color-coded image. |
lllyasviel/control_v11p_sd15_lineart | Trained with line art generation | An image with line art, usually black lines on a white background. |
lllyasviel/control_v11p_sd15s2_lineart_anime | Trained with anime line art generation | An image with anime-style line art. |
lllyasviel/control_v11p_sd15_openpose | Trained with human pose estimation | An image with human poses, usually represented as a set of keypoints or skeletons. |
lllyasviel/control_v11p_sd15_scribble | Trained with scribble-based image generation | An image with scribbles, usually random or user-drawn strokes. |
lllyasviel/control_v11p_sd15_softedge | Trained with soft edge image generation | An image with soft edges, usually to create a more painterly or artistic effect. |
lllyasviel/control_v11e_sd15_shuffle | Trained with image shuffling | An image with shuffled patches or regions. |
lllyasviel/control_v11f1e_sd15_tile | Trained with image tiling | A blurry image or part of an image. |
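All of these checkpoints share the same loading API; only the repository ID (and the matching preprocessor for the condition image) changes. A minimal sketch using the canny checkpoint from the table:

```python
import torch
from diffusers import ControlNetModel

# Any checkpoint from the table loads the same way; only the repo ID differs.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
)
```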
For more information, please also check out the Diffusers ControlNet Blog Post, and have a look at the official docs.