我又死了我又死了我又死了!
CGAN一种带条件约束的GAN,在生成模型(D)和判别模型(G)的建模中均引入条件变量y(conditional variable y)。
使用额外信息y对模型增加条件,可以指导数据生成过程。这些条件变量y可以基于多种信息,例如类别标签,用于图像修复的部分数据,来自不同模态(modality)的数据。
如果条件变量y是类别标签,可以看做CGAN是把纯无监督的 GAN 变成有监督的模型的一种改进。
这个简单直接的改进被证明非常有效。
简单来讲,普通的GAN输入的是一个N维的正态分布随机数,而CGAN会为这个随机数添上标签,其利用Embedding层将正整数(索引值)转换为固定尺寸的稠密向量,并将这个稠密向量与N维的正态分布随机数相乘,从而获得一个有标签的随机数。
生成网络的输入是一个带标签的随机数,具体操作方式是生成一个N维的正态分布随机数,再利用Embedding层将正整数(索引值)转换为N维的稠密向量,并将这个稠密向量与N维的正态分布随机数相乘。
def build_generator(self):
model = Sequential()
model.add(Dense(256, input_dim=self.latent_dim))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(1024))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(np.prod(self.img_shape), activation='tanh'))
model.add(Reshape(self.img_shape))
# 输入一个数字,将其转换为固定尺寸的稠密向量
# 输出维度是self.latent_dim
label = Input(shape=(1,), dtype='int32')
label_embedding = Flatten()(Embedding(self.num_classes, self.latent_dim)(label))
# 将正态分布和索引对应的稠密向量相乘
noise = Input(shape=(self.latent_dim,))
model_input = multiply([noise, label_embedding])
img = model(model_input)
return Model([noise, label], img)
普通GAN的判别模型的目的是根据输入的图片判断出真伪。
在CGAN中,其不仅要判断出真伪,还要判断出种类。
因此它的输入一个28,28,1维的图片,输出有两个:
一个是0到1之间的数,1代表判断这个图片是真的,0代表判断这个图片是假的。与普通GAN不同的是,它使用的是卷积神经网络。
另一个是一个向量,用于判断这张图片属于什么类。
def build_discriminator(self):
model = Sequential()
model.add(Flatten(input_shape=self.img_shape))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.4))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.4))
model.summary()
label = Input(shape=(1,), dtype='int32')
img = Input(shape=self.img_shape)
features = model(img)
# 一个是真伪,一个是类别向量
validity = Dense(1, activation="sigmoid")(features)
label = Dense(self.num_classes, activation="softmax")(features)
return Model(img, [validity, label])
CGAN的训练和GAN不太一样,分为如下几个步骤:
1、随机选取batch_size个真实的图片和它的标签。
2、随机生成batch_size个N维向量和其对应的标签label,利用Embedding层进行组合,传入到Generator中生成batch_size个虚假图片。
3、Discriminator的loss函数由两部分组成,一部分是真伪的判断结果与真实情况的对比,一部分是图片所属标签的判断结果。
4、Generator的loss函数也由两部分组成,一部分是生成的图片是否被Discriminator判断为1,另一部分是生成的图片是否被分成了正确的类。
from __future__ import print_function, division
from keras.datasets import mnist
from keras.layers import Input, Dense, Reshape, Flatten, Dropout, multiply
from keras.layers import BatchNormalization, Activation, Embedding, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam
import matplotlib.pyplot as plt
import os
import numpy as np
class CGAN():
def __init__(self):
# 输入shape
self.img_rows = 28
self.img_cols = 28
self.channels = 1
self.img_shape = (self.img_rows, self.img_cols, self.channels)
# 分十类
self.num_classes = 10
self.latent_dim = 100
# adam优化器
optimizer = Adam(0.0002, 0.5)
# 判别模型
losses = ['binary_crossentropy', 'sparse_categorical_crossentropy']
self.discriminator = self.build_discriminator()
self.discriminator.compile(loss=losses,
optimizer=optimizer,
metrics=['accuracy'])
# 生成模型
self.generator = self.build_generator()
# conbine是生成模型和判别模型的结合
# 判别模型的trainable为False
# 用于训练生成模型
noise = Input(shape=(self.latent_dim,))
label = Input(shape=(1,))
img = self.generator([noise, label])
self.discriminator.trainable = False
valid, target_label = self.discriminator(img)
self.combined = Model([noise, label], [valid, target_label])
self.combined.compile(loss=losses,
optimizer=optimizer)
def build_generator(self):
model = Sequential()
model.add(Dense(256, input_dim=self.latent_dim))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(1024))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(np.prod(self.img_shape), activation='tanh'))
model.add(Reshape(self.img_shape))
# 输入一个数字,将其转换为固定尺寸的稠密向量
# 输出维度是self.latent_dim
label = Input(shape=(1,), dtype='int32')
label_embedding = Flatten()(Embedding(self.num_classes, self.latent_dim)(label))
# 将正态分布和索引对应的稠密向量相乘
noise = Input(shape=(self.latent_dim,))
model_input = multiply([noise, label_embedding])
img = model(model_input)
return Model([noise, label], img)
def build_discriminator(self):
model = Sequential()
model.add(Flatten(input_shape=self.img_shape))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.4))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.4))
label = Input(shape=(1,), dtype='int32')
img = Input(shape=self.img_shape)
features = model(img)
# 一个是真伪,一个是类别向量
validity = Dense(1, activation="sigmoid")(features)
label = Dense(self.num_classes, activation="softmax")(features)
return Model(img, [validity, label])
def train(self, epochs, batch_size=128, sample_interval=50):
# 载入数据库
(X_train, y_train), (_, _) = mnist.load_data()
# 归一化
X_train = (X_train.astype(np.float32) - 127.5) / 127.5
X_train = np.expand_dims(X_train, axis=3)
y_train = y_train.reshape(-1, 1)
valid = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))
for epoch in range(epochs):
# --------------------- #
# 训练鉴别模型
# --------------------- #
idx = np.random.randint(0, X_train.shape[0], batch_size)
imgs, labels = X_train[idx], y_train[idx]
# ---------------------- #
# 生成正态分布的输入
# ---------------------- #
noise = np.random.normal(0, 1, (batch_size, self.latent_dim))
sampled_labels = np.random.randint(0, 10, (batch_size, 1))
gen_imgs = self.generator.predict([noise, sampled_labels])
d_loss_real = self.discriminator.train_on_batch(imgs, [valid, labels])
d_loss_fake = self.discriminator.train_on_batch(gen_imgs, [fake, sampled_labels])
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
# --------------------- #
# 训练生成模型
# --------------------- #
g_loss = self.combined.train_on_batch([noise, sampled_labels], [valid, sampled_labels])
print ("%d [D loss: %f, acc.: %.2f%%, op_acc: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100*d_loss[3], 100*d_loss[4], g_loss[0]))
if epoch % sample_interval == 0:
self.sample_images(epoch)
def sample_images(self, epoch):
r, c = 2, 5
noise = np.random.normal(0, 1, (r * c, 100))
sampled_labels = np.arange(0, 10).reshape(-1, 1)
gen_imgs = self.generator.predict([noise, sampled_labels])
gen_imgs = 0.5 * gen_imgs + 0.5
fig, axs = plt.subplots(r, c)
cnt = 0
for i in range(r):
for j in range(c):
axs[i,j].imshow(gen_imgs[cnt,:,:,0], cmap='gray')
axs[i,j].set_title("Digit: %d" % sampled_labels[cnt])
axs[i,j].axis('off')
cnt += 1
fig.savefig("images/%d.png" % epoch)
plt.close()
if __name__ == '__main__':
if not os.path.exists("./images"):
os.makedirs("./images")
cgan = CGAN()
cgan.train(epochs=20000, batch_size=256, sample_interval=200)