远洋之帆

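A Survey of AIGC Techniques for Marketing Image Generation

Generating assets from text

Imagen

Imagen first encodes the user's text prompt with a frozen T5-XXL encoder. A text-conditioned diffusion model then turns the text embedding into a small 64×64 image. Finally, two text-conditioned super-resolution diffusion models upsample that image, first from 64×64 to 256×256 and then from 256×256 to 1024×1024, to produce the final result.

According to the Brain Team at Google Research, which developed Imagen, combining a large Transformer language model with image diffusion models gives Imagen an unprecedented degree of photorealism. Google also reports that human raters prefer Imagen over other models in both image fidelity and image-text alignment.

Google notes, however, that Imagen was trained on datasets scraped from the web. Although much undesirable content such as pornographic imagery and offensive language was filtered out, the data still contains a large amount of inappropriate material, so the model can reproduce racist slurs and harmful social stereotypes.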

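Stable Diffusion

As the name suggests, Stable Diffusion is built on a diffusion model.

Put simply, a diffusion model generates an image step by step through the repeated application of a denoising autoencoder.

Diffusion in the usual sense means repeatedly adding small amounts of random noise to an image. A diffusion model learns the reverse process: it turns noise into a high-resolution image. The network trained for this is usually a U-Net.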
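The noising process described above can be sketched in a few lines. This is a minimal illustration, not Stable Diffusion's actual code: the linear beta schedule, its endpoints (1e-4 to 0.02, common DDPM defaults), and the names make_alpha_bars and add_noise are assumptions chosen for the example. It uses the closed form that jumps from a clean signal to step t directly instead of adding noise t times.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t ~ N(sqrt(a_t) * x0, (1 - a_t) * I), where a_t = alpha_bars[t]."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


alpha_bars = make_alpha_bars(1000)
rng = random.Random(0)
pixels = [0.5, -0.25, 1.0, 0.0]  # toy "image" with values scaled to [-1, 1]
slightly_noisy = add_noise(pixels, 10, alpha_bars, rng)   # still close to the input
almost_noise = add_noise(pixels, 999, alpha_bars, rng)    # nearly pure Gaussian noise
```

The U-Net is trained to undo this corruption: given a noisy sample and the step t, it predicts the noise that was mixed in.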

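However, because such a model operates directly in pixel space, training it and running it are both very expensive.

Against this background, Stable Diffusion works in two steps.

First, an encoder compresses the image x into a lower-dimensional latent representation z(x).

The context y, that is, the input text prompt, guides the denoising of x.

Together with the time step t, it is injected into the network that operates on the latent representation in two ways: simple concatenation and cross-attention.

Diffusion and denoising are then carried out on z(x). In other words, the model does not compute on the image directly, which cuts training time and improves results.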
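To make the saving concrete, the arithmetic below compares the number of values in a pixel-space image with the number in its latent. The downsampling factor of 8 and the 4 latent channels are the values used by the Stable Diffusion v1 autoencoder; treat them as assumptions here, since the text does not state them, and note that latent_shape is a hypothetical helper written for this example.

```python
import math


def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape (channels, height, width) of the latent for an image of the given size."""
    if height % downsample or width % downsample:
        raise ValueError("height and width must be multiples of the downsampling factor")
    return (latent_channels, height // downsample, width // downsample)


pixel_shape = (3, 512, 512)          # RGB image in pixel space
z_shape = latent_shape(512, 512)     # latent the diffusion model actually works on
ratio = math.prod(pixel_shape) / math.prod(z_shape)
print(z_shape, ratio)
```

Under these assumptions every denoising step touches dozens of times fewer values than it would in pixel space.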

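It is worth noting that Stable Diffusion's conditioning mechanism is very flexible: y can be not only an image label but also a masked image, a scene segmentation map, or a spatial layout.

Diversified asset generation

Diversification through text-driven image editing

DreamBooth

Given 3 to 5 casually taken photos of an object, DreamBooth can render that object in new contexts with different backgrounds while preserving its key features.

The authors also point out that the method is not tied to a particular model: with some adjustments, DALL·E 2 could offer the same capability.

Methodologically, DreamBooth attaches a "unique identifier" to the object.

That is, an image generation model normally receives only the class of an object, such as [cat] or [dog], whereas DreamBooth puts a unique identifier in front of the class, giving [V] [class noun].

Take the figure below as an example: three user-supplied photos of a dog and the corresponding class name (e.g. "dog") are the inputs, and the output is a fine-tuned text-to-image diffusion model.

This model uses "a [V] dog" to refer specifically to the dog in the user's photos. The phrase can then be placed in a text description to generate images of that particular dog, where [V] is the unique identifier.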
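The prompt scheme can be illustrated with a small helper. This is only a sketch: build_dreambooth_prompts is a hypothetical function, the "a photo of ..." wording is an assumption, and "[V]" stands in for whatever rare token is actually chosen as the identifier. The first two return values correspond to what the training script later in this document receives through --instance_prompt and --class_prompt.

```python
def build_dreambooth_prompts(identifier, class_noun, contexts):
    """Return (instance_prompt, class_prompt, generation_prompts)."""
    subject = f"a {identifier} {class_noun}"
    instance_prompt = f"a photo of {subject}"        # paired with the user's photos
    class_prompt = f"a photo of a {class_noun}"      # describes the generic class
    generation_prompts = [f"{subject} {context}" for context in contexts]
    return instance_prompt, class_prompt, generation_prompts


instance_prompt, class_prompt, generation_prompts = build_dreambooth_prompts(
    "[V]", "dog", ["on a beach", "in front of the Eiffel Tower"]
)
for prompt in generation_prompts:
    print(prompt)
```

Keeping the class noun in every prompt is what lets the model reuse its general knowledge of dogs while the identifier carries the details of this particular dog.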

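Why not simply use [V] alone to refer to the whole [specific object]?

The authors explain that, with so few input photos, the model cannot learn the overall characteristics of the object well and may overfit instead.

Fine-tuning is therefore used: the model still relies on the [class] features it has already learned, and the specific features learned through [V] refine them.

Take generating a white dog as an example. Through [V] the model learns individual details such as the dog's color (white) and build; combined with what it already knows about the class [dog] in general, it can generate many plausible photos of this white dog that still retain its individuality.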

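Fine-tuning proceeds in two stages. First, the low-resolution text-to-image model is fine-tuned on the input photos paired with a prompt that contains the identifier and the class name; before this step, the dog generated from such a prompt is an arbitrary one.

Then the super-resolution diffusion models are fine-tuned on low-resolution and high-resolution pairs made from the same photos, so that the fine details of the user's specific dog are preserved.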

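InstructPix2Pix

InstructPix2Pix combines two mature large-scale pretrained models, the language model GPT-3 and the text-to-image model Stable Diffusion, to generate a dataset dedicated to training image editing, and then trains a conditional diffusion model on it. The model performs an edit within seconds, which makes it practical to use.

The overall construction has two parts: (1) generating a dataset dedicated to image editing; (2) training a conditional diffusion model on the generated dataset. The resulting model can edit a target image in many ways by following human instructions, for example replacing objects, changing the style of the image, or modifying the background.

Using the knowledge contained in GPT-3 and Stable Diffusion, the authors build a multimodal training dataset in which each sample consists of a text editing instruction and the corresponding pair of images before and after the edit. To construct it, they first generate an editing instruction together with paired image captions, and then generate the corresponding image pair from those captions.

1. Generating paired image captions

In this step, an image caption is given first, for example "a photo of a girl riding a horse", as shown in figure (a) above. A plausible editing instruction is then generated from the caption, for example "make the girl ride a dragon", along with the caption that describes the edited image, "a photo of a girl riding a dragon". A large language model such as GPT-3 can do this. Note that these operations take place entirely in the text domain, which makes it possible to generate a large and diverse set of editing instructions while keeping image changes and text instructions in correspondence.

Specifically, the authors fine-tune GPT-3 for this purpose. They first collect a relatively small, human-written dataset of editing triplets, each consisting of (1) an input caption, (2) an editing instruction, and (3) an output caption; the table below describes the dataset in detail.

They collect 700 input captions, write the editing instruction and output caption for each by hand, and fine-tune GPT-3 on these 700 samples. The fine-tuned model can then generate training samples on its own. The table above contrasts the hand-written samples with those subsequently generated by GPT-3.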
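The triplet structure described above maps naturally onto a small record type. The sketch below is illustrative only: EditTriplet and to_finetune_record are hypothetical names, and the prompt/completion layout is an assumed format for language-model fine-tuning rather than the one the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditTriplet:
    input_caption: str    # (1) caption of the original image
    instruction: str      # (2) the edit to apply
    output_caption: str   # (3) caption of the edited image


def to_finetune_record(triplet):
    """Hypothetical layout: the model sees the caption and must write the rest."""
    return {
        "prompt": triplet.input_caption,
        "completion": f"{triplet.instruction}\n{triplet.output_caption}",
    }


example = EditTriplet(
    input_caption="a photo of a girl riding a horse",
    instruction="make the girl ride a dragon",
    output_caption="a photo of a girl riding a dragon",
)
record = to_finetune_record(example)
print(record)
```

Once fine-tuned on such records, the language model only needs a new input caption to propose both an instruction and the matching output caption.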

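Once the paired captions are available, the authors use the text-to-image model Stable Diffusion to turn the two text prompts (before and after the edit) into a pair of corresponding images, as shown in figure (b) above. This step faces a major challenge: current text-to-image models do not guarantee that image content stays consistent, even when the conditioning prompt changes only slightly.

For example, given two very similar prompts, "a photo of a cat" and "a photo of a black cat", the model may produce images of two completely different cats. That is unsuitable here, because the pairs are meant to teach a model to edit an image, not to generate an unrelated image as the underlying model normally would.

To address this, the authors use the recently proposed Prompt-to-Prompt method [3], which encourages multiple generations from a text-to-image diffusion model to be similar, so that the resulting images share the same content identity. Prompt-to-Prompt achieves this by sharing cross-attention weights during some of the denoising steps.

InstructPix2Pix is derived from latent diffusion. Latent diffusion improves the efficiency and quality of diffusion models by operating in the latent space of a pretrained variational autoencoder with encoder E and decoder D. For an image x, the diffusion process adds noise to the encoded latent z = E(x), producing a noisy latent z_t whose noise level increases with the time step t. A network ε_θ is then trained to predict the noise added to z_t given the image conditioning c_I and the text instruction conditioning c_T, by minimizing the following objective: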
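For reference, the objective in the InstructPix2Pix paper is the standard latent diffusion noise-prediction loss extended with the two conditioning inputs. The notation below follows the paper as best it can be reconstructed: E is the VAE encoder, c_I the input image, c_T the text instruction, z_t the noisy latent at step t, and ε_θ the denoising network.

```latex
L = \mathbb{E}_{\mathcal{E}(x),\, \mathcal{E}(c_I),\, c_T,\, \epsilon \sim \mathcal{N}(0, 1),\, t}
\left[ \left\| \epsilon - \epsilon_\theta\big(z_t, t, \mathcal{E}(c_I), c_T\big) \right\|_2^2 \right]
```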

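Previous work [4] showed that, for image translation tasks, fine-tuning a large image diffusion model often outperforms training a model from scratch, especially when paired training data is limited. The authors therefore initialize the model from pretrained Stable Diffusion weights. To support image conditioning, they add extra input channels to the first convolutional layer of the model.

To further improve image quality and how closely the output follows its conditioning, the authors also use classifier-free guidance in InstructPix2Pix. Classifier-free diffusion guidance is a way to trade off the quality and diversity of the samples a diffusion model generates. It effectively shifts probability mass toward data for which an implicit classifier assigns high likelihood to the conditioning, which improves visual quality and makes sampled images match the conditioning more closely. Training for classifier-free guidance involves jointly training the model for conditional and unconditional denoising, and the two score estimates are combined at inference time.

For this task, the score network has two conditionings: the input image c_I and the text instruction c_T. During training, InstructPix2Pix is made capable of conditional or unconditional denoising with respect to both or either conditioning. The authors introduce two guidance scales, s_I and s_T, which can be adjusted to trade off how strongly the generated sample corresponds to the input image against how strongly it corresponds to the edit instruction. The score estimate is modified as follows: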
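For reference, the two-scale guidance rule in the InstructPix2Pix paper has the following form, where ∅ denotes a dropped (null) conditioning, s_I is the image guidance scale and s_T the text guidance scale:

```latex
\tilde{e}_\theta(z_t, c_I, c_T) = e_\theta(z_t, \varnothing, \varnothing)
  + s_I \cdot \big( e_\theta(z_t, c_I, \varnothing) - e_\theta(z_t, \varnothing, \varnothing) \big)
  + s_T \cdot \big( e_\theta(z_t, c_I, c_T) - e_\theta(z_t, c_I, \varnothing) \big)
```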

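The "turn David into a cyborg" example in the figure below shows how these two parameters affect the generated samples: s_I controls similarity to the input image, while s_T controls consistency with the edit instruction.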
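How the two scales act can also be checked numerically. The sketch below combines three toy noise predictions (no conditioning, image only, image plus text) the way two-scale classifier-free guidance does in the InstructPix2Pix paper. The function name guided_noise and the toy numbers are assumptions for illustration; 1.5 and 7.5 are merely typical settings, not values taken from the text.

```python
def guided_noise(eps_uncond, eps_image, eps_full, s_image, s_text):
    """Combine three noise predictions with an image scale and a text scale."""
    return [
        u + s_image * (i - u) + s_text * (f - i)
        for u, i, f in zip(eps_uncond, eps_image, eps_full)
    ]


eps_uncond = [0.0, 1.0]   # prediction with both conditionings dropped
eps_image = [1.0, 1.5]    # prediction conditioned on the input image only
eps_full = [2.0, 0.5]     # prediction conditioned on image and instruction

plain = guided_noise(eps_uncond, eps_image, eps_full, s_image=1.0, s_text=1.0)
no_text = guided_noise(eps_uncond, eps_image, eps_full, s_image=1.0, s_text=0.0)
strong = guided_noise(eps_uncond, eps_image, eps_full, s_image=1.5, s_text=7.5)
print(plain, no_text, strong)
```

With both scales at 1 the rule reduces to the fully conditioned prediction, and with the text scale at 0 the instruction is ignored; larger scales extrapolate further along each direction.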

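Diversification through 3D modeling

3D model from a single image

Point-E

Point-E does not output 3D objects in the traditional sense. It generates point clouds, that is, discrete sets of points in space that represent a 3D shape. The E in Point-E stands for "efficiency", as it is faster than previous 3D object generation methods. Point clouds are computationally easier to synthesize, but they cannot capture an object's fine-grained shape or texture, which is currently a key limitation of Point-E.

To address this, the OpenAI team trained an additional AI system that converts Point-E's point clouds into meshes.

Point-E architecture and how it works

Apart from the separate mesh generation model, Point-E consists of two main models: a text-to-image model and an image-to-3D model. The text-to-image model is similar to generative systems such as OpenAI's own DALL-E 2 and Stable Diffusion; it is trained on labeled images to learn the association between words and visual concepts. The image-to-3D model is trained on a set of images paired with 3D objects so that it learns to translate effectively between the two.

Rather than training a single generative model that produces point clouds directly from text, generation is split into three steps.

First, a synthetic view conditioned on the text caption is generated.

Next, a coarse point cloud (1,024 points) conditioned on the synthetic view is generated.

Finally, a fine point cloud (4,096 points) conditioned on the low-resolution point cloud and the synthetic view is generated.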

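The Point-E authors trained their models on several million 3D models. They found that data formats and quality varied widely across the dataset, which prompted them to develop various post-processing steps to ensure higher data quality.

To convert all the data into one common format, they rendered every 3D model as RGBAD images from 20 random camera angles using Blender, which supports a variety of 3D formats and comes with an optimized rendering engine.

For each model, the Blender script normalizes the model to a bounding cube, configures a standard lighting setup, and finally exports RGBAD images using Blender's built-in real-time rendering engine.

The renders are then used to convert each object into a colored point cloud. First, a dense point cloud is built for each object by computing a point for every pixel of every RGBAD image. These point clouds typically contain hundreds of thousands of unevenly spaced points, so farthest point sampling is additionally used to create uniform clouds of 4K points.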
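Farthest point sampling is simple enough to sketch directly. The version below is a plain, quadratic-time illustration of the general technique, not the Point-E implementation; the function name, the fixed starting index, and the tiny example cloud are assumptions.

```python
def farthest_point_sampling(points, k, start=0):
    """Return indices of k points, each chosen as far as possible from those already picked."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    chosen = [start]
    # min_d[i] is the squared distance from point i to its nearest chosen point.
    min_d = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=min_d.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = sq_dist(p, points[nxt])
            if d < min_d[i]:
                min_d[i] = d
    return chosen


# A dense clump near the origin plus two isolated points.
cloud = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (10, 0, 0), (0, 10, 0)]
picked = farthest_point_sampling(cloud, 3)
print(picked)
```

Because each new point maximizes its distance to the current selection, dense regions are not over-represented, which is why the result is far more uniform than random subsampling.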

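Constructing point clouds directly from renders sidesteps various problems that can arise when sampling points directly from 3D meshes, such as sampling points that lie inside the model or dealing with 3D models stored in unusual file formats.

Finally, various heuristics are used to reduce the frequency of low-quality models in the dataset.

First, flat objects are eliminated by computing the SVD of each point cloud and keeping only those objects whose smallest singular value is above a certain threshold.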
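The flatness check can be sketched with NumPy. This is an illustration of the idea under stated assumptions: the cloud is centered before the SVD, the threshold value is arbitrary, and smallest_singular_value and is_flat are hypothetical helper names; the source does not give the actual threshold or preprocessing.

```python
import numpy as np


def smallest_singular_value(points):
    """Smallest singular value of the centered N x 3 point matrix."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Singular values are returned in descending order, so the last one is the smallest.
    return float(np.linalg.svd(centered, compute_uv=False)[-1])


def is_flat(points, threshold):
    """True if the cloud has almost no extent along its thinnest direction."""
    return smallest_singular_value(points) < threshold


rng = np.random.default_rng(0)
cube = rng.uniform(-1.0, 1.0, size=(500, 3))   # fills a volume
plane = cube.copy()
plane[:, 2] = 0.0                              # squashed onto the z = 0 plane

threshold = 1.0  # illustrative value only
print(is_flat(plane, threshold), is_flat(cube, threshold))
```

A cloud that lies in a plane has essentially no variance along the plane's normal, so its smallest singular value is near zero and it is filtered out.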

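Next, the dataset is clustered by CLIP features (for each object, the features are averaged over all of its renders).

Some clusters turned out to contain many low-quality categories of models, while others appeared more diverse or interpretable.

The clusters were binned into several buckets of varying quality, and a weighted mixture of the resulting buckets was used as the final dataset.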

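3D model from multiple images

NeRF

NeRF received a Best Paper Honorable Mention at ECCV 2020 and has been hugely influential. Major CV and CG conferences are now full of follow-up work (yet another rabbit hole). The method uses an implicit representation and performs novel view synthesis supervised by posed 2D images. In plain terms, it implicitly reconstructs a 3D scene from multiple 2D pictures, and the results it shows are stunning.

Hopefully this write-up gives you a quick understanding of the work.

Paper: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Project page: NeRF: Neural Radiance Fields

Pipeline

NeRF training pipeline

The idea behind NeRF is to store a 3D scene implicitly in a neural network, so that feeding in a camera pose is enough to obtain an image of the scene. NeRF models the scene as a continuous 5D radiance field (which can be thought of as an implicit voxel description).

Given the position (x,y,z) of a point in the scene and the viewing direction (θ,ϕ), the neural network FΘ outputs (c,σ), where c is the color emitted in that direction and σ is the volume density at that point. The classical volume rendering equation is then applied: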

$$C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt, \quad \text{where } T(t) = \exp\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)$$

The expected color C(r) of camera ray r(t) = o + td with near and far bounds t_n and t_f.
The function T(t) denotes the accumulated transmittance along the ray from t_n to t, i.e., the probability that the ray travels from t_n to t without hitting any other particle.

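This yields the image seen from the input camera pose; computing a loss against the ground truth then makes the whole pipeline differentiably optimizable.

The idea boils down to something quite simple:

Use a network to store the volume information: (x,y,z,θ,ϕ)→(c,σ)

Then use the volume rendering equation to produce the image for a view: ray sampling plus integration

Finally, compute the loss against the original image of that view and update the network

Note that the color emitted at a point differs by direction (it is view-dependent); see Fig. 3 and Fig. 4 of the paper for the effect. View dependence can represent anisotropic optical properties.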

See Fig. 3 for an example of how our method uses the input viewing direction to represent non-Lambertian effects. As shown in Fig. 4, a model trained without view dependence (only x as input) has difficulty representing specularities.
An open question: what would happen if the volume density were also view-dependent?

Numerical estimation of volume rendering

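The previous section described how NeRF produces an image. The volume rendering equation is in continuous form, which cannot be evaluated exactly during neural network training. It is therefore approximated in discrete form using quadrature:

$$\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - \exp(-\sigma_i \delta_i)\right) \mathbf{c}_i, \quad \text{where } T_i = \exp\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right)$$

where:

$$t_i \sim \mathcal{U}\left[t_n + \frac{i-1}{N}(t_f - t_n),\; t_n + \frac{i}{N}(t_f - t_n)\right]$$

$$\delta_i = t_{i+1} - t_i$$

The ray is split into N equal segments between t_n and t_f, and one sample is drawn uniformly at random from each segment for the rendering computation.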
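The stratified sampling and the discrete sum can be sketched in plain Python. This is an illustration rather than the reference implementation: the densities and colors are supplied by hand instead of coming from the network, the function names are hypothetical, and the last interval is closed with the far bound (an assumption; implementations may treat it differently).

```python
import math
import random


def stratified_samples(t_near, t_far, n, rng):
    """One uniform random sample inside each of n equal segments of [t_near, t_far]."""
    width = (t_far - t_near) / n
    return [t_near + (i + rng.random()) * width for i in range(n)]


def render_ray(ts, sigmas, colors, t_far):
    """Discrete volume rendering: sum over i of T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    pixel = [0.0, 0.0, 0.0]
    weights = []
    transmittance = 1.0  # T_1 = 1: nothing has blocked the ray yet
    for i, (sigma, color) in enumerate(zip(sigmas, colors)):
        t_next = ts[i + 1] if i + 1 < len(ts) else t_far
        delta = t_next - ts[i]
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha  # equals multiplying by exp(-sigma_i * delta_i)
    return pixel, weights


rng = random.Random(0)
ts = stratified_samples(2.0, 6.0, 8, rng)

# Empty space first, then an opaque green surface that hides the blue one behind it.
pixel, weights = render_ray(
    [1.0, 2.0, 3.0], [0.0, 1e3, 1e3], [(1, 0, 0), (0, 1, 0), (0, 0, 1)], t_far=4.0
)
print(pixel, weights)
```

The weights returned alongside the color are the same quantities that hierarchical sampling later reuses to decide where to place additional samples.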

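A derivation of the discrete form is given at the end.

Hierarchical volume sampling

The discrete approximation above still spends samples on free space and occluded regions, which contribute nothing to the rendered result. To sample more efficiently, the paper adopts a hierarchical representation inspired by [1] and optimizes two networks simultaneously: a "coarse" one and a "fine" one.

The authors first draw N_c points with stratified sampling and evaluate the coarse network's rendering equation:

$$\hat{C}_c(\mathbf{r}) = \sum_{i=1}^{N_c} w_i \mathbf{c}_i, \quad w_i = T_i \left(1 - \exp(-\sigma_i \delta_i)\right)$$

Normalizing the weights, $\hat{w}_i = w_i / \sum_{j=1}^{N_c} w_j$, gives a piecewise-constant probability density function along the ray. N_f additional points are then drawn from it by inverse transform sampling and added to the original N_c points for the fine rendering.

Inverse transform sampling: drawing u uniformly from [0, 1), the range of the CDF of a distribution p, and mapping it through the inverse CDF gives a sample distributed according to p.

After this second round of sampling, more of the sample points fall in regions that actually contribute to the color computation.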
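Inverse transform sampling from a piecewise-constant density can be sketched as follows. The function sample_pdf and its arguments are hypothetical names for this illustration, and the bin edges and weights are made up; in NeRF the weights would come from the coarse pass.

```python
import bisect
import random


def sample_pdf(bin_edges, weights, n_samples, rng):
    """Draw samples from the piecewise-constant PDF defined by per-bin weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    pdf = [w / total for w in weights]
    cdf = [0.0]
    for p in pdf:
        cdf.append(cdf[-1] + p)

    samples = []
    for _ in range(n_samples):
        u = rng.random()  # uniform in [0, 1)
        # Find the bin k with cdf[k] <= u < cdf[k + 1].
        k = min(bisect.bisect_right(cdf, u) - 1, len(pdf) - 1)
        frac = (u - cdf[k]) / pdf[k] if pdf[k] > 0 else 0.0
        lo, hi = bin_edges[k], bin_edges[k + 1]
        samples.append(lo + frac * (hi - lo))
    return sorted(samples)


rng = random.Random(0)
edges = [2.0, 3.0, 4.0, 5.0]   # three bins along the ray
coarse_weights = [0.0, 1.0, 0.0]  # only the middle bin contributed to the color
fine_ts = sample_pdf(edges, coarse_weights, 100, rng)
print(fine_ts[0], fine_ts[-1])
```

All of the new samples land in the middle bin, which is exactly the behavior the fine pass relies on: samples concentrate where the coarse weights say the scene content is.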

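The final loss function is:

$$\mathcal{L} = \sum_{\mathbf{r} \in \mathcal{R}} \left[ \left\| \hat{C}_c(\mathbf{r}) - C(\mathbf{r}) \right\|_2^2 + \left\| \hat{C}_f(\mathbf{r}) - C(\mathbf{r}) \right\|_2^2 \right]$$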

Positional encoding

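Attractive as "implicit representation plus volume rendering" sounds, the figure below (No Position Encoding) shows that the images it generates are very blurry, which can be understood as a loss of high-frequency information. Previous work [2] showed that neural networks are biased toward learning low-frequency functions, whereas NeRF has to reconstruct a scene in sharp detail (in effect overfitting to the scene), so the model must be made to attend to high-frequency detail. The position vector x and the direction vector d are therefore mapped to a higher-dimensional, high-frequency representation:

$$\gamma(p) = \left(\sin(2^0 \pi p), \cos(2^0 \pi p), \cdots, \sin(2^{L-1} \pi p), \cos(2^{L-1} \pi p)\right)$$

Intuitively, even when two points are so close in the original space that they are hard to tell apart, they become easy to distinguish after positional encoding.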

In our experiments, we set L = 10 for γ(x) and L = 4 for γ(d).

Feeding the positionally encoded (x,y,z) and (θ,ϕ) into the network then produces much sharper images.
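The encoding γ is short enough to write out. The sketch below applies it per coordinate with L = 10 for positions and L = 4 for directions, as quoted above; it assumes the direction is passed as a 3D unit vector, and the function names are chosen for this example only.

```python
import math


def positional_encoding(p, num_freqs):
    """gamma(p) = (sin(2^0 pi p), cos(2^0 pi p), ..., sin(2^(L-1) pi p), cos(2^(L-1) pi p))."""
    out = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * p
        out.extend((math.sin(angle), math.cos(angle)))
    return out


def encode_vector(v, num_freqs):
    """Apply gamma to every coordinate of v and concatenate the results."""
    return [feature for coord in v for feature in positional_encoding(coord, num_freqs)]


position_features = encode_vector((0.1, 0.2, 0.3), num_freqs=10)   # 3 * 2 * 10 values
direction_features = encode_vector((0.0, 0.0, 1.0), num_freqs=4)   # 3 * 2 * 4 values

# Two coordinates that differ by only 0.001 end up far apart in the encoded space.
a = positional_encoding(0.500, 10)
b = positional_encoding(0.501, 10)
largest_gap = max(abs(x - y) for x, y in zip(a, b))
print(len(position_features), len(direction_features), largest_gap)
```

The highest-frequency terms are what separate nearby inputs, which gives the network a handle on fine detail that the raw coordinates do not provide.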

Diversification through image compositing

Demo

Text-to-image generation

from diffusers import StableDiffusionPipeline
import torch

model_id = "dreamlike-art/dreamlike-photoreal-2.0"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "blender 3d model, a stand full body cute fluen ant doll with blue and white eyes of technology style,bright cinematic lighting, gopro, fisheye lens  closeup,highly detailed, digital painting, artstation, concept art, smooth,  volumetric light"
image = pipe(prompt).images[0]

image.save("./result.jpg")

" a ant doll , blue and white eyes,technology style "

Increasing diversity through prompt text

import argparse
import hashlib
import itertools
import math
import os
from pathlib import Path
from typing import Optional

import torch
import torch.nn.functional as F
import torch.utils.checkpoint
from torch.utils.data import Dataset

from accelerate import Accelerator
from accelerate.logging import get_logger
from accelerate.utils import set_seed
from diffusers import AutoencoderKL, DDPMScheduler, StableDiffusionPipeline, UNet2DConditionModel
from diffusers.optimization import get_scheduler
from huggingface_hub import HfFolder, Repository, whoami
from PIL import Image
from torchvision import transforms
from tqdm.auto import tqdm
from transformers import CLIPTextModel, CLIPTokenizer


logger = get_logger(__name__)


def parse_args(input_args=None):
    parser = argparse.ArgumentParser(description="Simple example of a training script.")
    parser.add_argument(
        "--pretrained_model_name_or_path",
        type=str,
        default=None,
        required=True,
        help="Path to pretrained model or model identifier from huggingface.co/models.",
    )
    parser.add_argument(
        "--revision",
        type=str,
        default=None,
        required=False,
        help="Revision of pretrained model identifier from huggingface.co/models.",
    )
    parser.add_argument(
        "--tokenizer_name",
        type=str,
        default=None,
        help="Pretrained tokenizer name or path if not the same as model_name",
    )
    parser.add_argument(
        "--instance_data_dir",
        type=str,
        default=None,
        required=True,
        help="A folder containing the training data of instance images.",
    )
    parser.add_argument(
        "--class_data_dir",
        type=str,
        default=None,
        required=False,
        help="A folder containing the training data of class images.",
    )
    parser.add_argument(
        "--instance_prompt",
        type=str,
        default=None,
        help="The prompt with identifier specifying the instance",
    )
    parser.add_argument(
        "--class_prompt",
        type=str,
        default=None,
        help="The prompt to specify images in the same class as provided instance images.",
    )
    parser.add_argument(
        "--with_prior_preservation",
        default=False,
        action="store_true",
        help="Flag to add prior preservation loss.",
    )
    parser.add_argument("--prior_loss_weight", type=float, default=1.0, help="The weight of prior preservation loss.")
    parser.add_argument(
        "--num_class_images",
        type=int,
        default=100,
        help=(
            "Minimal class images for prior preservation loss. If not have enough images, additional images will be"
            " sampled with class_prompt."
        ),
    )
    parser.add_argument(
        "--output_dir",
        type=str,
        default="text-inversion-model",
        help="The output directory where the model predictions and checkpoints will be written.",
    )
    parser.add_argument("--seed", type=int, default=None, help="A seed for reproducible training.")
    parser.add_argument(
        "--resolution",
        type=int,
        default=512,
        help=(
            "The resolution for input images, all the images in the train/validation dataset will be resized to this"
            " resolution"
        ),
    )
    parser.add_argument(
        "--center_crop", action="store_true", help="Whether to center crop images before resizing to resolution"
    )
    parser.add_argument(
        "--use_filename_as_label", action="store_true", help="Uses the filename as the image labels instead of the instance_prompt, useful for regularization when training for styles with wide image variance"
    )
    parser.add_argument(
        "--use_txt_as_label", action="store_true", help="Uses the filename.txt file's content as the image labels instead of the instance_prompt, useful for regularization when training for styles with wide image variance"
    )
    parser.add_argument("--train_text_encoder", action="store_true", help="Whether to train the text encoder")
    parser.add_argument(
        "--train_batch_size", type=int, default=4, help="Batch size (per device) for the training dataloader."
    )
    parser.add_argument(
        "--sample_batch_size", type=int, default=4, help="Batch size (per device) for sampling images."
    )
    parser.add_argument("--num_train_epochs", type=int, default=1)
    parser.add_argument(
        "--max_train_steps",
        type=int,
        default=None,
        help="Total number of training steps to perform.  If provided, overrides num_train_epochs.",
    )
    parser.add_argument(
        "--gradient_accumulation_steps",
        type=int,
        default=1,
        help="Number of updates steps to accumulate before performing a backward/update pass.",
    )
    parser.add_argument(
        "--gradient_checkpointing",
        action="store_true",
        help="Whether or not to use gradient checkpointing to save memory at the expense of slower backward pass.",
    )
    parser.add_argument(
        "--learning_rate",
        type=float,
        default=5e-6,
        help="Initial learning rate (after the potential warmup period) to use.",
    )
    parser.add_argument(
        "--scale_lr",
        action="store_true",
        default=False,
        help="Scale the learning rate by the number of GPUs, gradient accumulation steps, and batch size.",
    )
    parser.add_argument(
        "--lr_scheduler",
        type=str,
        default="constant",
        help=(
            'The scheduler type to use. Choose between ["linear", "cosine", "cosine_with_restarts", "polynomial",'
            ' "constant", "constant_with_warmup"]'
        ),
    )
    parser.add_argument(
        "--lr_warmup_steps", type=int, default=500, help="Number of steps for the warmup in the lr scheduler."
    )
    parser.add_argument(
        "--use_8bit_adam", action="store_true", help="Whether or not to use 8-bit Adam from bitsandbytes."
    )
    parser.add_argument("--adam_beta1", type=float, default=0.9, help="The beta1 parameter for the Adam optimizer.")
    parser.add_argument("--adam_beta2", type=float, default=0.999, help="The beta2 parameter for the Adam optimizer.")
    parser.add_argument("--adam_weight_decay", type=float, default=1e-2, help="Weight decay to use.")
    parser.add_argument("--adam_epsilon", type=float, default=1e-08, help="Epsilon value for the Adam optimizer")
    parser.add_argument("--max_grad_norm", default=1.0, type=float, help="Max gradient norm.")
    parser.add_argument("--push_to_hub", action="store_true", help="Whether or not to push the model to the Hub.")
    parser.add_argument("--hub_token", type=str, default=None, help="The token to use to push to the Model Hub.")
    parser.add_argument(
        "--hub_model_id",
        type=str,
        default=None,
        help="The name of the repository to keep in sync with the local `output_dir`.",
    )
    parser.add_argument(
        "--logging_dir",
        type=str,
        default="logs",
        help=(
            "[TensorBoard](https://www.tensorflow.org/tensorboard) log directory. Will default to"
            " *output_dir/runs/**CURRENT_DATETIME_HOSTNAME***."
        ),
    )
    parser.add_argument(
        "--log_with",
        type=str,
        default="tensorboard",
        choices=["tensorboard", "wandb"]
    )
    parser.add_argument(
        "--mixed_precision",
        type=str,
        default="no",
        choices=["no", "fp16", "bf16"],
        help=(
            "Whether to use mixed precision. Choose "
            "between fp16 and bf16 (bfloat16). Bf16 requires PyTorch >= 1.10 "
            "and an Nvidia Ampere GPU."
        ),
    )
    parser.add_argument("--local_rank", type=int, default=-1, help="For distributed training: local_rank")
    parser.add_argument("--save_model_every_n_steps", type=int)
    parser.add_argument("--auto_test_model", action="store_true", help="Whether or not to automatically test the model after saving it")
    parser.add_argument("--test_prompt", type=str, default="A photo of a cat", help="The prompt to use for testing the model.")
    parser.add_argument("--test_prompts_file", type=str, default=None, help="The file containing the prompts to use for testing the model. Example: test_prompts.txt, where each line is a prompt.")
    parser.add_argument("--test_negative_prompt", type=str, default="", help="The negative prompt to use for testing the model.")
    parser.add_argument("--test_seed", type=int, default=42, help="The seed to use for testing the model.")
    parser.add_argument("--test_num_per_prompt", type=int, default=1, help="The number of images to generate per prompt.")
    
    if input_args is not None:
        args = parser.parse_args(input_args)
    else:
        args = parser.parse_args()

    env_local_rank = int(os.environ.get("LOCAL_RANK", -1))
    if env_local_rank != -1 and env_local_rank != args.local_rank:
        args.local_rank = env_local_rank

    if args.instance_data_dir is None:
        raise ValueError("You must specify a train data directory.")

    if args.with_prior_preservation:
        if args.class_data_dir is None:
            raise ValueError("You must specify a data directory for class images.")
        if args.class_prompt is None:
            raise ValueError("You must specify prompt for class images.")

    return args

# turns a path into a filename without the extension
def get_filename(path):
    return path.stem

def get_label_from_txt(path):
    txt_path = path.with_suffix(".txt") # get the path to the .txt file
    if txt_path.exists():
        with open(txt_path, "r") as f:
            return f.read()
    else:
        return ""

class DreamBoothDataset(Dataset):
    """
    A dataset to prepare the instance and class images with the prompts for fine-tuning the model.
    It pre-processes the images and tokenizes the prompts.
    """

    def __init__(
        self,
        instance_data_root,
        instance_prompt,
        tokenizer,
        class_data_root=None,
        class_prompt=None,
        size=512,
        center_crop=False,
        use_filename_as_label=False,
        use_txt_as_label=False,
    ):
        self.size = size
        self.center_crop = center_crop
        self.tokenizer = tokenizer

        self.instance_data_root = Path(instance_data_root)
        if not self.instance_data_root.exists():
            raise ValueError("Instance images root doesn't exist.")

        self.instance_images_path = list(self.instance_data_root.glob("*.jpg")) + list(self.instance_data_root.glob("*.png"))
        self.num_instance_images = len(self.instance_images_path)
        self.instance_prompt = instance_prompt
        self.use_filename_as_label = use_filename_as_label
        self.use_txt_as_label = use_txt_as_label
        self._length = self.num_instance_images

        if class_data_root is not None:
            self.class_data_root = Path(class_data_root)
            self.class_data_root.mkdir(parents=True, exist_ok=True)
            self.class_images_path = list(self.class_data_root.glob("*.jpg")) + list(self.class_data_root.glob("*.png"))
            self.num_class_images = len(self.class_images_path)
            self._length = max(self.num_class_images, self.num_instance_images)
            self.class_prompt = class_prompt
        else:
            self.class_data_root = None

        self.image_transforms = transforms.Compose(
            [
                transforms.Resize(size, interpolation=transforms.InterpolationMode.BILINEAR),
                transforms.CenterCrop(size) if center_crop else transforms.RandomCrop(size),
                transforms.ToTensor(),
                transforms.Normalize([0.5], [0.5]),
            ]
        )

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        example = {}
        path = self.instance_images_path[index % self.num_instance_images]
        prompt = get_filename(path) if self.use_filename_as_label else self.instance_prompt
        prompt = get_label_from_txt(path) if self.use_txt_as_label else prompt
        
        print("prompt", prompt)
        
        instance_image = Image.open(path)
        if not instance_image.mode == "RGB":
            instance_image = instance_image.convert("RGB")
        example["instance_images"] = self.image_transforms(instance_image)
        example["instance_prompt_ids"] = self.tokenizer(
            prompt,
            padding="do_not_pad",
            truncation=True,
            max_length=self.tokenizer.model_max_length,
        ).input_ids

        if self.class_data_root:
            class_image = Image.open(self.class_images_path[index % self.num_class_images])
            if not class_image.mode == "RGB":
                class_image = class_image.convert("RGB")
            example["class_images"] = self.image_transforms(class_image)
            example["class_prompt_ids"] = self.tokenizer(
                self.class_prompt,
                padding="do_not_pad",
                truncation=True,
                max_length=self.tokenizer.model_max_length,
            ).input_ids

        return example


class PromptDataset(Dataset):
    "A simple dataset to prepare the prompts to generate class images on multiple GPUs."

    def __init__(self, prompt, num_samples):
        self.prompt = prompt
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        example = {}
        example["prompt"] = self.prompt
        example["index"] = index
        return example


def get_full_repo_name(model_id: str, organization: Optional[str] = None, token: Optional[str] = None):
    if token is None:
        token = HfFolder.get_token()
    if organization is None:
        username = whoami(token)["name"]
        return f"{username}/{model_id}"
    else:
        return f"{organization}/{model_id}"

def test_model(folder, args):
    if args.test_prompts_file is not None:
        with open(args.test_prompts_file, "r") as f:
            prompts = f.read().splitlines()
    else:
        prompts = [args.test_prompt]
    
    test_path = os.path.join(folder, "test")
    if not os.path.exists(test_path):
        os.makedirs(test_path)
    
    print("Testing the model...")
    from diffusers import DDIMScheduler
    
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    torch_dtype = torch.float16 if device.type == "cuda" else torch.float32
    pipeline = StableDiffusionPipeline.from_pretrained(
        folder,
        torch_dtype=torch_dtype,
        safety_checker=None,
        load_in_8bit=True,
        scheduler = DDIMScheduler(
            beta_start=0.00085,
            beta_end=0.012,
            beta_schedule="scaled_linear",
            clip_sample=False,
            set_alpha_to_one=False,
        ),
    )
    pipeline.set_progress_bar_config(disable=True)
    pipeline.enable_attention_slicing()
    pipeline = pipeline.to(device)

    torch.manual_seed(args.test_seed)
    with torch.autocast('cuda'): 
        for prompt in prompts:
            print(f"Generating test images for prompt: {prompt}")
            test_images = pipeline(
                prompt=prompt,
                width=512,
                height=512,
                negative_prompt=args.test_negative_prompt,
                num_inference_steps=30, 
                num_images_per_prompt=args.test_num_per_prompt,
            ).images
            
            for index, image in enumerate(test_images):
                image.save(f"{test_path}/{prompt}_{index}.png")
    
    del pipeline
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        
    print(f"Test completed. The examples are saved in {test_path}")


def save_model(accelerator, unet, text_encoder, args, step=None):
    unet = accelerator.unwrap_model(unet) 
    text_encoder = accelerator.unwrap_model(text_encoder)

    if step is None:
        folder = args.output_dir
    else:
        folder = args.output_dir + "-Step-" + str(step)

    print("Saving Model Checkpoint...")
    print("Directory: " + folder)

    # Create the pipeline using the trained modules and save it.
    if accelerator.is_main_process:
        pipeline = StableDiffusionPipeline.from_pretrained(
            args.pretrained_model_name_or_path,
            unet=unet,
            text_encoder=text_encoder,
            revision=args.revision,
        )
        pipeline.save_pretrained(folder)
        del pipeline
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        
        if args.auto_test_model:
            print("Testing Model...")
            test_model(folder, args)

        if args.push_to_hub:
            repo.push_to_hub(commit_message="End of training", blocking=False, auto_lfs_prune=True)


def main(args):
    logging_dir = Path(args.logging_dir)

    accelerator = Accelerator(
        gradient_accumulation_steps=args.gradient_accumulation_steps,
        mixed_precision=args.mixed_precision,
        log_with=args.log_with,
        logging_dir=logging_dir,
    )


    # Currently, it's not possible to do gradient accumulation when training two models with accelerate.accumulate
    # This will be enabled soon in accelerate. For now, we don't allow gradient accumulation when training two models.
    # TODO (patil-suraj): Remove this check when gradient accumulation with two models is enabled in accelerate.
    if args.train_text_encoder and args.gradient_accumulation_steps > 1 and accelerator.num_processes > 1:
        raise ValueError(
            "Gradient accumulation is not supported when training the text encoder in distributed training. "
            "Please set gradient_accumulation_steps to 1. This feature will be supported in the future."
        )

    if args.seed is not None:
        set_seed(args.seed)

    if args.with_prior_preservation:
        class_images_dir = Path(args.class_data_dir)
        if not class_images_dir.exists():
            class_images_dir.mkdir(parents=True)
        cur_class_images = len(list(class_images_dir.iterdir()))

        if cur_class_images < args.num_class_images:
            torch_dtype = torch.float16 if accelerator.device.type == "cuda" else torch.float32
            pipeline = StableDiffusionPipeline.from_pretrained(
                args.pretrained_model_name_or_path,
                torch_dtype=torch_dtype,
                safety_checker=None,
                revision=args.revision,
            )
            pipeline.set_progress_bar_config(disable=True)

            num_new_images = args.num_class_images - cur_class_images
            logger.info(f"Number of class images to sample: {num_new_images}.")

            sample_dataset = PromptDataset(args.class_prompt, num_new_images)
            sample_dataloader = torch.utils.data.DataLoader(sample_dataset, batch_size=args.sample_batch_size)

            sample_dataloader = accelerator.prepare(sample_dataloader)
            pipeline.to(accelerator.device)

            for example in tqdm(
                sample_dataloader, desc="Generating class images", disable=not accelerator.is_local_main_process
            ):
                images = pipeline(example["prompt"]).images

                for i, image in enumerate(images):
                    hash_image = hashlib.sha1(image.tobytes()).hexdigest()
                    image_filename = class_images_dir / f"{example['index'][i] + cur_class_images}-{hash_image}.jpg"
                    image.save(image_filename)

            del pipeline
            if torch.cuda.is_available():
                torch.cuda.empty_cache()

    # Handle the repository creation
    if accelerator.is_main_process:
        if args.push_to_hub:
            if args.hub_model_id is None:
                repo_name = get_full_repo_name(Path(args.output_dir).name, token=args.hub_token)
            else:
                repo_name = args.hub_model_id
            repo = Repository(args.output_dir, clone_from=repo_name)

            with open(os.path.join(args.output_dir, ".gitignore"), "w+") as gitignore:
                if "step_*" not in gitignore:
                    gitignore.write("step_*\n")
                if "epoch_*" not in gitignore:
                    gitignore.write("epoch_*\n")
        elif args.output_dir is not None:
            os.makedirs(args.output_dir, exist_ok=True)

    # Load the tokenizer
    if args.tokenizer_name:
        tokenizer = CLIPTokenizer.from_pretrained(
            args.tokenizer_name,
            revision=args.revision,
        )
    elif args.pretrained_model_name_or_path:
        tokenizer = CLIPTokenizer.from_pretrained(
            args.pretrained_model_name_or_path,
            subfolder="tokenizer",
            revision=args.revision,
        )

    # Load models and create wrapper for stable diffusion
    text_encoder = CLIPTextModel.from_pretrained(
        args.pretrained_model_name_or_path,
        subfolder="text_encoder",
        revision=args.revision,
    )
    vae = AutoencoderKL.from_pretrained(
        args.pretrained_model_name_or_path,
        subfolder="vae",
        revision=args.revision,
    )
    unet = UNet2DConditionModel.from_pretrained(
        args.pretrained_model_name_or_path,
        subfolder="unet",
        revision=args.revision,
    )

    vae.requires_grad_(False)
    if not args.train_text_encoder:
        text_encoder.requires_grad_(False)

    if args.gradient_checkpointing:
        unet.enable_gradient_checkpointing()
        if args.train_text_encoder:
            text_encoder.gradient_checkpointing_enable()

    if args.scale_lr:
        args.learning_rate = (
            args.learning_rate * args.gradient_accumulation_steps * args.train_batch_size * accelerator.num_processes
        )

    # Use 8-bit Adam for lower memory usage or to fine-tune the model in 16GB GPUs
    if args.use_8bit_adam:
        try:
            import bitsandbytes as bnb
        except ImportError:
            raise ImportError(
                "To use 8-bit Adam, please install the bitsandbytes library: `pip install bitsandbytes`."
            )

        optimizer_class = bnb.optim.AdamW8bit
    else:
        optimizer_class = torch.optim.AdamW

    params_to_optimize = (
        itertools.chain(unet.parameters(), text_encoder.parameters()) if args.train_text_encoder else unet.parameters()
    )
    optimizer = optimizer_class(
        params_to_optimize,
        lr=args.learning_rate,
        betas=(args.adam_beta1, args.adam_beta2),
        weight_decay=args.adam_weight_decay,
        eps=args.adam_epsilon,
    )

    noise_scheduler = DDPMScheduler.from_config(args.pretrained_model_name_or_path, subfolder="scheduler")

    train_dataset = DreamBoothDataset(
        instance_data_root=args.instance_data_dir,
        instance_prompt=args.instance_prompt,
        class_data_root=args.class_data_dir if args.with_prior_preservation else None,
        class_prompt=args.class_prompt,
        tokenizer=tokenizer,
        size=args.resolution,
        center_crop=args.center_crop,
        use_filename_as_label=args.use_filename_as_label,
        use_txt_as_label=args.use_txt_as_label,
    )

    def collate_fn(examples):
        input_ids = [example["instance_prompt_ids"] for example in examples]
        pixel_values = [example["instance_images"] for example in examples]

        # Concat class and instance examples for prior preservation.
        # We do this to avoid doing two forward passes.
        if args.with_prior_preservation:
            input_ids += [example["class_prompt_ids"] for example in examples]
            pixel_values += [example["class_images"] for example in examples]

        pixel_values = torch.stack(pixel_values)
        pixel_values = pixel_values.to(memory_format=torch.contiguous_format).float()

        input_ids = tokenizer.pad({"input_ids": input_ids}, padding=True, return_tensors="pt").input_ids

        batch = {
            "input_ids": input_ids,
            "pixel_values": pixel_values,
        }
        return batch

    train_dataloader = torch.utils.data.DataLoader(
        train_dataset, batch_size=args.train_batch_size, shuffle=True, collate_fn=collate_fn, num_workers=1
    )

    # Scheduler and math around the number of training steps.
    overrode_max_train_steps = False
    num_update_steps_per_epoch = math.ceil(len(train_dataloader) / args.gradient_accumulation_steps)
    if args.max_train_steps is None:
        args.max_train_steps = args.num_train_epochs * num_update_steps_per_epoch
        overrode_max_train_steps = True

    lr_scheduler = get_scheduler(
        args.lr_scheduler,
        optimizer=optimizer,
        num_warmup_steps=args.lr_warmup_steps * args.gradient_accumulation_steps,
        num_training_steps=args.max_train_steps * args.gradient_accumulation_steps,
    )

    if args.train_text_encoder:
        unet, text_encoder, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
            unet, text_encoder, optimizer, train_dataloader, lr_scheduler
        )
    else:
        unet, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
            unet, optimizer, train_dataloader, lr_scheduler
        )

    weight_dtype = torch.float32
    if args.mixed_precision == "fp16":
        weight_dtype = torch.float16
    elif args.mixed_precision == "bf16":
        weight_dtype = torch.bfloat16

    # Move text_encoder and vae to gpu.
    # For mixed precision training we cast the text_encoder and vae weights to half-precision
    # as these models are only used for inference, keeping weights in full precision is not required.
    vae.to(accelerator.device, dtype=weight_dtype)
    if not args.train_text_encoder:
        text_encoder.to(accelerator.device, dtype=weight_dtype)

    # We need to recalculate our total training steps as the size of the training dataloader may have changed.
    num_update_steps_per_epoch = math.ceil(len(train_dataloader) / args.gradient_accumulation_steps)
    if overrode_max_train_steps:
        args.max_train_steps = args.num_train_epochs * num_update_steps_per_epoch
    # Afterwards we recalculate our number of training epochs
    args.num_train_epochs = math.ceil(args.max_train_steps / num_update_steps_per_epoch)

    # We need to initialize the trackers we use, and also store our configuration.
    # The trackers initialize automatically on the main process.
    if accelerator.is_main_process:
        accelerator.init_trackers("dreambooth", config=vars(args))

    # Train!
    total_batch_size = args.train_batch_size * accelerator.num_processes * args.gradient_accumulation_steps

    logger.info("***** Running training *****")
    logger.info(f"  Num examples = {len(train_dataset)}")
    logger.info(f"  Num batches each epoch = {len(train_dataloader)}")
    logger.info(f"  Num Epochs = {args.num_train_epochs}")
    logger.info(f"  Instantaneous batch size per device = {args.train_batch_size}")
    logger.info(f"  Total train batch size (w. parallel, distributed & accumulation) = {total_batch_size}")
    logger.info(f"  Gradient Accumulation steps = {args.gradient_accumulation_steps}")
    logger.info(f"  Total optimization steps = {args.max_train_steps}")
    # Only show the progress bar once on each machine.
    progress_bar = tqdm(range(args.max_train_steps), disable=not accelerator.is_local_main_process)
    progress_bar.set_description("Steps")
    global_step = 0

    for epoch in range(args.num_train_epochs):
        unet.train()
        if args.train_text_encoder:
            text_encoder.train()
        for step, batch in enumerate(train_dataloader):
            with accelerator.accumulate(unet):
                # Convert images to latent space
                latents = vae.encode(batch["pixel_values"].to(dtype=weight_dtype)).latent_dist.sample()
                # 0.18215 is the latent scaling factor of the Stable Diffusion v1 VAE
                latents = latents * 0.18215

                # Sample noise that we'll add to the latents
                noise = torch.randn_like(latents)
                bsz = latents.shape[0]
                # Sample a random timestep for each image
                timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps, (bsz,), device=latents.device)
                timesteps = timesteps.long()

                # Add noise to the latents according to the noise magnitude at each timestep
                # (this is the forward diffusion process)
                noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

                # Get the text embedding for conditioning
                encoder_hidden_states = text_encoder(batch["input_ids"])[0]

                # Predict the noise residual
                noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample

                if args.with_prior_preservation:
                    # Chunk the noise and noise_pred into two parts and compute the loss on each part separately.
                    noise_pred, noise_pred_prior = torch.chunk(noise_pred, 2, dim=0)
                    noise, noise_prior = torch.chunk(noise, 2, dim=0)

                    # Compute instance loss
                    loss = F.mse_loss(noise_pred.float(), noise.float(), reduction="none").mean([1, 2, 3]).mean()

                    # Compute prior loss
                    prior_loss = F.mse_loss(noise_pred_prior.float(), noise_prior.float(), reduction="mean")

                    # Add the prior loss to the instance loss.
                    loss = loss + args.prior_loss_weight * prior_loss
                else:
                    loss = F.mse_loss(noise_pred.float(), noise.float(), reduction="mean")

                accelerator.backward(loss)
                if accelerator.sync_gradients:
                    params_to_clip = (
                        itertools.chain(unet.parameters(), text_encoder.parameters())
                        if args.train_text_encoder
                        else unet.parameters()
                    )
                    accelerator.clip_grad_norm_(params_to_clip, args.max_grad_norm)
                optimizer.step()
                lr_scheduler.step()
                optimizer.zero_grad()

            # Checks if the accelerator has performed an optimization step behind the scenes
            if accelerator.sync_gradients:
                progress_bar.update(1)
                global_step += 1

            logs = {"loss": loss.detach().item(), "lr": lr_scheduler.get_last_lr()[0]}
            progress_bar.set_postfix(**logs)
            accelerator.log(logs, step=global_step)

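            if global_step >= args.max_train_steps:
                break

            # Save intermediate checkpoints only right after a real optimizer step, so that
            # gradient-accumulation micro-steps do not trigger repeated saves (or a save at step 0).
            if (
                accelerator.sync_gradients
                and args.save_model_every_n_steps is not None
                and global_step % args.save_model_every_n_steps == 0
            ):
                save_model(accelerator, unet, text_encoder, args, global_step)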

        accelerator.wait_for_everyone()

    save_model(accelerator, unet, text_encoder, args, step=None)

    accelerator.end_training()


if __name__ == "__main__":
    args = parse_args()
    main(args)
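
The prior-preservation branch of the training loop above can be sketched without PyTorch. This is a minimal illustration, not the script's actual implementation: `mse` and `dreambooth_loss` are hypothetical helper names, and nested Python lists stand in for tensors. It assumes the batch layout produced by `collate_fn`, with instance examples in the first half and class examples in the second half.

```python
def mse(pred, target):
    """Mean squared error between two equally sized flat lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(noise_pred, noise, prior_loss_weight=1.0):
    """Instance loss plus weighted prior-preservation loss for one batch.

    Assumed layout (hypothetical stand-in for the tensors in the script):
    [instance examples..., class examples...], split in the middle.
    """
    half = len(noise_pred) // 2
    # First half: the user's own images (the "[V] dog" examples).
    instance_loss = sum(mse(p, t) for p, t in zip(noise_pred[:half], noise[:half])) / half
    # Second half: class images generated by the original model.
    prior_count = len(noise_pred) - half
    prior_loss = sum(mse(p, t) for p, t in zip(noise_pred[half:], noise[half:])) / prior_count
    return instance_loss + prior_loss_weight * prior_loss


# One instance example and one class example, each with two "pixels".
noise_pred = [[1.0, 2.0], [0.0, 0.0]]
noise = [[1.0, 0.0], [1.0, 1.0]]
print(dreambooth_loss(noise_pred, noise, prior_loss_weight=1.0))
```

Lowering `prior_loss_weight` makes the model fit the instance images more aggressively, at the cost of drifting further from what it already knows about the class.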

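# Training setup for a specific object or person (a single label is enough)
export MODEL_NAME="./model"
export INSTANCE_DIR="./datasets/test2"
export OUTPUT_DIR="./new_model"
export CLASS_DIR="./datasets/class" # Folder for the prior-preservation images generated by the model; do not change
export LOG_DIR="/root/tf-logs"
export TEST_PROMPTS_FILE="./test_prompts_object.txt"

# Clear the class images when training a different object or person than last time;
# otherwise comment out the next line (prefix it with #).
# ${VAR:?} aborts if the variable is unset or empty, so rm never expands to "/*".
rm -rf "${CLASS_DIR:?}"/*
rm -rf "${LOG_DIR:?}"/*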

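# [V] stands for the special identifier described in the DreamBooth section; replace it with a rare token of your choice.
accelerate launch tools/train_dreambooth.py \
  --train_text_encoder \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --mixed_precision="fp16" \
  --instance_data_dir=$INSTANCE_DIR \
  --instance_prompt="a photo of [V] dog" \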
  --with_prior_preservation --prior_loss_weight=1.0 \
  --class_prompt="a photo of dog" \
  --class_data_dir=$CLASS_DIR \
  --num_class_images=200 \
  --output_dir=$OUTPUT_DIR \
  --logging_dir=$LOG_DIR \
  --center_crop \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --auto_test_model \
  --test_prompts_file=$TEST_PROMPTS_FILE \
  --test_seed=123 \
  --test_num_per_prompt=3 \
  --max_train_steps=1000 \
  --save_model_every_n_steps=500 

# If you increase max_train_steps, remember to increase save_model_every_n_steps as well,
# otherwise the disk can easily fill up partway through training

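# Core parameters:
# The main ones
# --train_text_encoder: also train the text encoder
# --mixed_precision="fp16": mixed-precision training
# - center_crop
# Whether to crop the images; generally needed when the dataset images are not square
# - resolution
# Image resolution, usually 512; input images are resized to it automatically
# Combine with center_crop to crop to a square and resize to 512x512
# - instance_prompt
# Use this when training a specific subject, such as a particular person
# e.g. --instance_prompt="a photo of [V] girl", where [V] is the special identifier
# - class_prompt
# Use this when the subject belongs to a specific class; it may improve training results somewhat
# - use_txt_as_label
# Whether to read a .txt file with the same name as each image as its label
# Use this when training the image style of the whole model
# This option ignores whatever is passed in instance_prompt
# - learning_rate
# Learning rate, usually 2e-6; the key parameter to tune during training
# Too large and the model fails to converge; too small and training is slow
# - lr_scheduler, options: constant, constant_with_warmup, linear, cosine, cosine_with_restarts, polynomial
# Learning-rate schedule, usually constant (no adjustment). For large datasets you can try the others, but they may prevent convergence and require retuning the learning rate
# - lr_warmup_steps: can be ignored when the scheduler is constant
# With other schedulers it can be set to 0, which disables warmup,
# or to another value such as 1000, so that the learning rate ramps from 0 up to learning_rate over the first 1000 steps
# Usually not needed unless the dataset is large and convergence is slow
# - max_train_steps
# Maximum number of training steps, usually 1000; increase it for larger datasets
# - save_model_every_n_steps
# Save the model every N steps, useful for inspecting intermediate results to pick the best checkpoint, and for resuming training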

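# --with_prior_preservation and --prior_loss_weight=1.0 enable prior preservation and set the weight of the prior loss
# Use them when you have few training samples: they can improve results and help prevent overfitting (generated images that look too similar to the training images)

# --auto_test_model, --test_prompts_file, --test_seed, --test_num_per_prompt
# Respectively: test the model automatically (after every save_model_every_n_steps steps), the file of test prompts, the random seed, and the number of images generated per prompt
# Test images are saved in the test folder under the model output directory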
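
To make the `lr_warmup_steps` description concrete, here is a minimal sketch of a linear warmup followed by a constant learning rate. `warmup_lr` is a hypothetical helper written only for illustration. It is not the scheduler that the training script obtains from `get_scheduler`, and it ignores the multiplication by `gradient_accumulation_steps` that the script applies to the warmup length.

```python
def warmup_lr(step: int, base_lr: float = 2e-6, warmup_steps: int = 1000) -> float:
    """Learning rate at a given optimizer step: linear ramp from 0, then constant."""
    if warmup_steps <= 0:
        return base_lr  # warmup disabled, as with --lr_warmup_steps=0
    return base_lr * min(1.0, step / warmup_steps)


# Print the learning rate at a few points of a run with 1000 warmup steps.
for step in (0, 250, 500, 1000, 5000):
    print(step, warmup_lr(step))
```

With the default `--max_train_steps=1000`, a 1000-step warmup would span the whole run, which is one reason the script above leaves warmup at 0.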

import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline

model_path = "./new_model"  
prompt = "a cute girl, blue eyes, brown hair"
torch.manual_seed(123123123)

pipe = StableDiffusionPipeline.from_pretrained(
        model_path, 
        torch_dtype=torch.float16,
        scheduler=DDIMScheduler(
            beta_start=0.00085,
            beta_end=0.012,
            beta_schedule="scaled_linear",
            clip_sample=False,
            set_alpha_to_one=True,
        ),
        safety_checker=None
    )

# def dummy(images, **kwargs):
#     return images, False
# pipe.safety_checker = dummy
pipe = pipe.to("cuda")
images = pipe(prompt, width=512, height=512, num_inference_steps=30, num_images_per_prompt=3).images
for i, image in enumerate(images):
    image.save(f"test-{i}.png")
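
The `DDIMScheduler` arguments above describe a noise schedule. As a rough illustration, assuming the usual definition of `scaled_linear` (betas spaced linearly in square-root space, then squared) and 1000 training timesteps, the schedule and the forward-noising weights can be reproduced in plain Python. `scaled_linear_betas` and `noising_coefficients` are hypothetical names used only for this sketch.

```python
import math


def scaled_linear_betas(beta_start=0.00085, beta_end=0.012, num_steps=1000):
    """Betas spaced linearly between sqrt(beta_start) and sqrt(beta_end), then squared."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def noising_coefficients(betas):
    """Per-timestep (signal, noise) weights: sqrt(alpha_bar_t) and sqrt(1 - alpha_bar_t)."""
    coeffs, alpha_bar = [], 1.0
    for beta in betas:
        alpha_bar *= 1.0 - beta  # cumulative product of (1 - beta)
        coeffs.append((math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)))
    return coeffs


betas = scaled_linear_betas()
coeffs = noising_coefficients(betas)
# A noisy latent at timestep t is signal * clean_latent + noise_weight * gaussian_noise.
signal, noise_weight = coeffs[-1]
print(f"first beta={betas[0]:.5f}, last beta={betas[-1]:.5f}")
print(f"at the last timestep: signal={signal:.4f}, noise={noise_weight:.4f}")
```

The same weights are what the training loop's `add_noise` call applies to the latents: early timesteps keep most of the signal, late timesteps are almost pure noise.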

Point-E: single-image 3D modeling for more diversity

Blender: modifying color, lighting, and texture

References:

[1] https://imagen.research.google/

[2] https://twitter.com/ai__pub/status/1561362542487695360

[3] https://stability.ai/blog/stable-diffusion-announcement

[4] https://arxiv.org/abs/2112.10752

[5] https://dreambooth.github.io/

[6] https://twitter.com/natanielruizg/status/1563166568195821569

[7] https://natanielruiz.github.io/

[8] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.

[9] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.

[10] Amir Hertz, Ron Mokady, Jay Tenenbaum, Kfir Aberman, Yael Pritch, and Daniel Cohen-Or. Prompt-to-prompt image editing with cross attention control. arXiv preprint arXiv:2208.01626, 2022.

[11] Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, and Fang Wen. Pretraining is all you need for image-to-image translation. arXiv preprint arXiv:2205.12952, 2022.

[12] https://techcrunch.com/2022/12/20/openai-releases-point-e-an-ai-that-generates-3d-models/?tpcc=tcplustwitter

[13] https://www.engadget.com/openai-releases-point-e-dall-e-3d-text-modeling-210007892.html?src=rss
