Model:

IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1

Task:

Text-to-Image

Library:

Diffusers

Language:

Other:

stable-diffusion stable-diffusion-diffusers Chinese

Preprint:

arxiv:2112.10752 arxiv:2209.02970

License:

creativeml-openrail-m

Taiyi-Stable-Diffusion-1B-Chinese-v0.1

Main Page: Fengshenbang
Github: Fengshenbang-LM

Brief Introduction

The first open-source Chinese Stable Diffusion model, trained on 20 million filtered Chinese image-text pairs.

Gradio Web UI

You can try the model at Taiyi-Stable-Diffusion-Chinese, a Gradio web interface that runs Taiyi-Stable-Diffusion-1B-Chinese-v0.1.

Model Taxonomy

Demand	Task	Series	Model	Parameter	Extra
Special	Multimodal	Taiyi	Stable Diffusion	1B	Chinese

Model Information

We use the Noah-Wukong dataset (100M) and the Zero dataset (23M) as pre-training data. We first score the image-text similarity of every pair in both datasets with IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese and keep the pairs whose CLIP Score exceeds 0.2 as the training set. The same model, IDEA-CCNL/Taiyi-CLIP-RoBERTa-102M-ViT-L-Chinese, initializes the text encoder. To preserve the generative ability of the original model while aligning Chinese concepts with images, we train only the text encoder and freeze the rest of stable-diffusion-v1-4 (paper). So far the model has been trained for one epoch on 20 million image-text pairs, which took about 100 hours on 32 x A100. This is a preliminary version; we will continue to optimize it and open-source follow-up models. Feedback and discussion are welcome.
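The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the CLIP Score is the cosine similarity between an image embedding and a text embedding, and it uses hand-written toy vectors in place of real embeddings from Taiyi-CLIP. The names `cosine_similarity`, `filter_pairs`, and `toy_pairs` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

CLIP_SCORE_THRESHOLD = 0.2  # threshold stated in the model card

# (caption, image embedding, text embedding); embeddings are toy values here.
Pair = Tuple[str, Sequence[float], Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pairs(pairs: List[Pair], threshold: float = CLIP_SCORE_THRESHOLD) -> List[str]:
    """Keep the captions whose image/text similarity is strictly above threshold."""
    kept = []
    for caption, image_emb, text_emb in pairs:
        if cosine_similarity(image_emb, text_emb) > threshold:
            kept.append(caption)
    return kept


# Assumption: in practice the embeddings would come from the CLIP model.
toy_pairs: List[Pair] = [
    ("matching caption", [1.0, 0.0], [0.9, 0.1]),    # similarity close to 1
    ("unrelated caption", [1.0, 0.0], [0.0, 1.0]),   # similarity 0
    ("borderline caption", [1.0, 0.0], [0.1, 1.0]),  # similarity below 0.2
]
kept_captions = filter_pairs(toy_pairs)
```

Only the first toy pair clears the 0.2 threshold, so `kept_captions` contains a single caption.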

Result

Basic Prompt

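[Images: three generated samples, one per prompt below]

铁马冰河入梦来，3D绘画。 (Armored horses and frozen rivers enter my dreams, 3D painting.)
飞流直下三千尺，油画。 (The torrent plunges straight down three thousand feet, oil painting.)
女孩背影，日落，唯美插画。 (A girl seen from behind, sunset, aesthetic illustration.)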

Advanced Prompt

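[Images: three generated samples, one per prompt below]

铁马冰河入梦来，概念画，科幻，玄幻，3D (Armored horses and frozen rivers enter my dreams, concept art, sci-fi, fantasy, 3D)
中国海边城市，科幻，未来感，唯美，插画。 (Chinese seaside city, sci-fi, futuristic, aesthetic, illustration.)
那人却在灯火阑珊处，色彩艳丽，古风，资深插画师作品，桌面高清壁纸。 (There that person stands where the lantern light grows dim, vivid colors, classical Chinese style, work of a senior illustrator, HD desktop wallpaper.)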
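The advanced prompts above appear to follow one pattern: a subject followed by style modifiers, separated by full-width commas. A small helper can assemble prompts in that shape. The function name `build_prompt` and the joining convention are assumptions inferred from the examples, not something the model card specifies.

```python
from typing import Iterable

FULLWIDTH_COMMA = "，"  # separator used in the example prompts


def build_prompt(subject: str, modifiers: Iterable[str]) -> str:
    """Join a subject and style modifiers into one comma-separated prompt."""
    # Drop empty or whitespace-only modifiers so no dangling commas appear.
    parts = [subject.strip()] + [m.strip() for m in modifiers if m.strip()]
    return FULLWIDTH_COMMA.join(parts)


advanced_prompt = build_prompt("铁马冰河入梦来", ["概念画", "科幻", "玄幻", "3D"])
```

Here `advanced_prompt` reproduces the first advanced prompt listed above; passing an empty modifier list returns the subject unchanged.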

Usage

Full precision

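from diffusers import StableDiffusionPipeline

# Load the pipeline in full precision and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained("IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1").to("cuda")

# Prompt: "The torrent plunges straight down three thousand feet, oil painting"
prompt = '飞流直下三千尺，油画'
image = pipe(prompt, guidance_scale=7.5).images[0]
image.save("飞流.png")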

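Half precision FP16 (CUDA)

Adding torch_dtype=torch.float16 and device_map="auto" loads the FP16 weights quickly and speeds up inference. The example below passes only torch_dtype and moves the pipeline to the GPU with pipe.to('cuda'). For more information, see the optimization docs.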

# !pip install git+https://github.com/huggingface/accelerate
import torch
from diffusers import StableDiffusionPipeline

# Let cuDNN benchmark and pick the fastest convolution algorithms.
torch.backends.cudnn.benchmark = True

# Load the weights as FP16, then move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained("IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1", torch_dtype=torch.float16)
pipe.to('cuda')

prompt = '飞流直下三千尺，油画'
image = pipe(prompt, guidance_scale=7.5).images[0]
image.save("飞流.png")
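The two usage snippets differ only in precision and both assume a CUDA device. One way to combine them is to choose the precision from the available hardware. The sketch below is an illustration under that assumption; `select_precision` and `load_pipeline` are hypothetical helper names, and falling back to full precision on CPU is a design choice made here, not a recommendation from the model card.

```python
from typing import Tuple

MODEL_ID = "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1"


def select_precision(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, torch dtype name): FP16 on CUDA, full precision on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


def load_pipeline(model_id: str = MODEL_ID):
    """Load the pipeline with a precision that matches the available device."""
    # Imported lazily so select_precision stays usable without torch installed.
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = select_precision(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    return pipe.to(device)
```

With this helper, `pipe = load_pipeline()` followed by `pipe(prompt, guidance_scale=7.5).images[0]` would mirror the snippets above while still running, slowly, on a machine without a GPU.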

Citation

If you use our resources in your work, please cite our paper:

@article{fengshenbang,
  author    = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
  title     = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
  journal   = {CoRR},
  volume    = {abs/2209.02970},
  year      = {2022}
}

You can also cite our website:

@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}

Author:

Fengshenbang-LM

Dataset size:

8.93 GB
