LCM 的全称是 Latent Consistency Models(潜在一致性模型),由清华大学交叉信息研究院的研究者们构建。通过一些创新性的方法,LCM 只用少数的几步推理就能生成高分辨率图像,将主流文生图模型的效率提高 5-10 倍,所以能呈现出实时的效果。
目前LCM只可以使用一个模型Dreamshaper_v7,基于SD版本Dreamshaper微调而来的。
https://huggingface.co/SimianLuo/LCM_Dreamshaper_v7https://huggingface.co/SimianLuo/LCM_Dreamshaper_v7
GitHub - luosiallen/latent-consistency-model: Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step InferenceLatent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference - GitHub - luosiallen/latent-consistency-model: Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inferencehttps://github.com/luosiallen/latent-consistency-model
pip install --upgrade diffusers # make sure to use at least diffusers >= 0.22
pip install transformers accelerate
文生图
from diffusers import DiffusionPipeline
import torch
pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")
# To save GPU memory, torch.float16 can be used, but it may compromise image quality.
pipe.to(torch_device="cuda", torch_dtype=torch.float32)
prompt = "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k"
# Can be set to 1~50 steps. LCM support fast inference even <= 4 steps. Recommend: 1~8 steps.
num_inference_steps = 4
images = pipe(prompt=prompt, num_inference_steps=num_inference_steps, guidance_scale=8.0, lcm_origin_steps=50, output_type="pil").images
images[0].save("image.png")
图生图
from diffusers import AutoPipelineForImage2Image
import torch
import PIL
pipe = AutoPipelineForImage2Image.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")
# To save GPU memory, torch.float16 can be used, but it may compromise image quality.
pipe.to(torch_device="cuda", torch_dtype=torch.float32)
prompt = "High altitude snowy mountains"
image = PIL.Image.open("./snowy_mountains.png")
# Can be set to 1~50 steps. LCM support fast inference even <= 4 steps. Recommend: 1~8 steps.
num_inference_steps = 4
images = pipe(prompt=prompt, image=image, num_inference_steps=num_inference_steps, guidance_scale=8.0).images
images[0].save("image.png")
https://huggingface.co/spaces/SimianLuo/Latent_Consistency_Modelhttps://huggingface.co/spaces/SimianLuo/Latent_Consistency_Model
OpenXLab浦源 - 应用中心应用中心提供应用托管的服务,用户只需遵循平台规范,通过简单的前端封装组件(Gradio)即可构建模型推理应用演示demo,应用中心提供免费应用部署的能力,普通用户也可在应用中心中交互式体验模型的能力,更好帮助用户寻找想要的学术模型或应用服务。通过前端封装组件和平台的SDK工具,帮助AI开发者简单快速构建人工智能应用。https://openxlab.org.cn/apps/detail/Latent-Consistency-Model/Latent-Consistency-Model
SD WebUI 的 LCM 插件源码
GitHub - 0xbitches/sd-webui-lcm: Latent Consistency Model for AUTOMATIC1111 Stable Diffusion WebUILatent Consistency Model for AUTOMATIC1111 Stable Diffusion WebUI - GitHub - 0xbitches/sd-webui-lcm: Latent Consistency Model for AUTOMATIC1111 Stable Diffusion WebUIhttps://github.com/0xbitches/sd-webui-lcm选择“Extensions”->“Install from URL”,安装LCM插件。
生成的图片将会保存到outputs/txt2img-images/LCM
LCM插件提供了txt2img、img2img、vid2vid三个选项卡。
Img2Img和Vid2Vid的输出高度和宽度将与输入相同,目前不可更改。
生成的视频将保存到outputs/LCM-vid2vid
LCM每个模型都需要单独蒸馏,以获得潜在的一致性蒸馏。LCM-LoRA的核心思想是只训练几个适配器层,在这种情况下,适配器是LoRA。这样,我们不必训练完整模型,并保持可训练参数数量的管理。然后,由此产生的LoRA可以应用于模型的任何微调版本,而无需单独蒸馏它们。此外,LoRA可以应用于img2img, ControlNet/T2I-Adapter, inpainting, AnimateDiff等。LCM-LoRA还可以与其他LoRA相结合,只需几个步骤(4-8)即可生成样式图像。
LCM-LoRAs 可用于 stable-diffusion-v1-5, stable-diffusion-xl-base-1.0, 和 SSD-1B 模型。
https://huggingface.co/latent-consistency/lcm-lora-sdv1-5
https://huggingface.co/latent-consistency/lcm-lora-sdxl
https://huggingface.co/latent-consistency/lcm-lora-ssd-1b
使用时我们可以将推理步骤的数量减少到仅2到8个步骤,将CFG Scale值介于1.0和2.0之间。调整Lora的权重,使其更接近原本的效果。