CGAN and Its Code Implementation

Preface

  • This article mainly introduces CGAN and its code implementation
  • Before reading this, it is recommended to first read GAN (Generative Adversarial Network)
  • This article is based on a course lab; only the parts that had to be filled in are included

CGAN

Full name: Conditional Generative Adversarial Network.
By contrast, the basic GAN is sometimes called an Unconditional Generative Adversarial Network.

CGAN extends the basic GAN by adding labels to the inputs of both the Generator and the Discriminator. This lets us train per class and control the class of the generated images, so the results are no longer random in class.
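Formally, the conditioning enters through the label y on both sides of the standard GAN minimax game, as in Mirza and Osindero's original CGAN formulation:

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big]
```

Compared with the unconditional objective, the only change is that both the real data score and the generated sample are conditioned on y.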

[Figure 1: CGAN architecture diagram]

  • z is a random vector drawn from a fixed distribution
  • x is an image
  • y is the label that controls the class

Generator

  • Input: a batch of points (vectors) in the latent space and a batch of labels
  • Output: a batch of images
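The snippets below reference a few globals (opt, img_shape, FloatTensor, LongTensor) that are defined elsewhere in the lab scaffold. A minimal sketch of what they are assumed to look like (the values match the dimensions used in the code below; they are assumptions, not part of this excerpt):

```python
import argparse

import numpy as np
import torch

# hyperparameters referenced by the Generator / Discriminator / training loop
# (values are the usual MNIST choices implied by the comments below)
opt = argparse.Namespace(
    n_classes=10,    # number of label classes (MNIST digits)
    label_dim=50,    # dimensionality of the label embedding
    latent_dim=100,  # dimensionality of the noise vector z
    n_epochs=200,    # number of training epochs
)

# MNIST image shape: 1 channel, 28x28 pixels
img_shape = (1, 28, 28)

# tensor aliases so the same code runs with or without a GPU
cuda = torch.cuda.is_available()
FloatTensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor
LongTensor = torch.cuda.LongTensor if cuda else torch.LongTensor
```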

The full code is as follows:

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
		
        # encode the label into a vector
        self.label_embedding = nn.Embedding(opt.n_classes, opt.label_dim)  # 10 classes, 50-dim
        ## TODO: There are many ways to implement the model,  one alternative 
        ## architecture is (100+50)--->128--->256--->512--->1024--->(1,28,28)

        ### START CODE HERE
        def block(in_feat, out_feat, normalize=True):
            layers = [nn.Linear(in_feat, out_feat)]
            if normalize:
                layers.append(nn.BatchNorm1d(out_feat, 0.8))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.model = nn.Sequential(
            *block(opt.latent_dim + opt.label_dim, 128, normalize=False),
            *block(128, 256),
            *block(256, 512),
            *block(512, 1024),
            nn.Linear(1024, int(np.prod(img_shape))),
            nn.Tanh()
        )
        ### END CODE HERE

    def forward(self, noise, labels):
       
        ### START CODE HERE
        # Concatenate label embedding and image to produce input
        gen_input = torch.cat((self.label_embedding(labels), noise), -1)  # concatenate the two vectors
        img = self.model(gen_input)
        img = img.view(img.size(0), *img_shape)
        return img
        ### END CODE HERE

Detailed Walkthrough

  • nn.Embedding(num_embeddings, embedding_dim)
    • Encodes input indices into vectors
    • num_embeddings is the maximum number of distinct indices that can be embedded
    • embedding_dim is the dimensionality of the vector each index is mapped to
    import torch
    import torch.nn as nn
    embedding = nn.Embedding(10, 3)
    a = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])
    b = torch.LongTensor([1, 2, 3])
    print(embedding(a))
    >>tensor([[[-0.3592, -2.2254, -1.7580],
             [ 1.7920, -0.6600, -1.1435],
             [-0.8874,  0.2585, -1.0378],
             [ 0.4861,  0.3025, -1.0556]],
    
            [[-0.8874,  0.2585, -1.0378],
             [-0.0752, -0.1548, -0.7140],
             [ 1.7920, -0.6600, -1.1435],
             [-2.5180,  0.2028, -1.4452]]], grad_fn=<EmbeddingBackward>)
    print(embedding(b))
    >>tensor([[-0.3592, -2.2254, -1.7580],
            [ 1.7920, -0.6600, -1.1435],
            [-0.0752, -0.1548, -0.7140]], grad_fn=<EmbeddingBackward>)
    
  • nn.Linear()
     layers = [nn.Linear(in_feat, out_feat)]
    
    • nn.Linear() sets up a fully connected layer; its input and output may be multi-dimensional, and the linear map is applied to the last dimension
    • in_feat is the size of the last dimension of the input tensor
    • out_feat is the size of the last dimension of the output tensor
  • nn.Sequential()
    • Chains the given modules so they run in order
  • Forward pass
    • Concatenate the label embedding and the noise vector to form the input
    • Call self.model directly to run gen_input through the network
    • Reshape the flat output to the image shape (1, 28, 28) and return it
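The shape flow of these steps can be checked in isolation. The sketch below follows the (100+50)-dim input of the architecture above, with a single nn.Linear standing in for self.model:

```python
import torch
import torch.nn as nn

batch = 4
latent_dim, label_dim, n_classes = 100, 50, 10

label_embedding = nn.Embedding(n_classes, label_dim)
noise = torch.randn(batch, latent_dim)
labels = torch.randint(0, n_classes, (batch,))

# concatenate the label embedding and the noise along the last dimension
gen_input = torch.cat((label_embedding(labels), noise), -1)
print(gen_input.shape)  # torch.Size([4, 150])

# stand-in for self.model: maps the 150-dim input to a flat 784-dim image
model = nn.Linear(latent_dim + label_dim, 1 * 28 * 28)
img = model(gen_input).view(batch, 1, 28, 28)
print(img.shape)  # torch.Size([4, 1, 28, 28])
```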

Discriminator

  • Input: a batch of images and their corresponding labels
  • Output: real or fake (1 / 0)

The full code is as follows:

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()

        self.label_embedding = nn.Embedding(opt.n_classes, opt.label_dim)  # 10 classes, 50-dim
        ## TODO: There are many ways to implement the discriminator,  one alternative 
        ## architecture is (100+784)--->512--->512--->512--->1
        
        ### START CODE HERE
        self.model = nn.Sequential(
            nn.Linear(opt.label_dim + int(np.prod(img_shape)), 512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 512),
            nn.Dropout(0.4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 512),
            nn.Dropout(0.4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 1),
        )
        ### END CODE HERE
        
       
    def forward(self, img, labels):
        ### START CODE HERE
        # Concatenate label embedding and image to produce input
        d_in = torch.cat((img.view(img.size(0), -1), self.label_embedding(labels)), -1)
        validity = self.model(d_in)
        ### END CODE HERE
        
        return validity

  • Similar to the Generator:
    • The input is a batch of images and their corresponding labels
    • The forward computation is written in nn.Sequential
    • The output is a validity score for each image (real is 1, fake is 0)
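The main structural difference from the Generator is on the input side: each image is flattened and concatenated with its label embedding. A quick shape check (784 image values + 50 embedding values = 834, matching the first Linear layer above):

```python
import torch
import torch.nn as nn

batch, n_classes, label_dim = 4, 10, 50
img_shape = (1, 28, 28)

label_embedding = nn.Embedding(n_classes, label_dim)
imgs = torch.randn(batch, *img_shape)
labels = torch.randint(0, n_classes, (batch,))

# flatten each image to 784 values, then append the 50-dim label embedding
d_in = torch.cat((imgs.view(imgs.size(0), -1), label_embedding(labels)), -1)
print(d_in.shape)  # torch.Size([4, 834])
```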

Training Process

  • The code is explained in the comments below
## TODO: implement the training process

for epoch in range(opt.n_epochs):
    for i, (imgs, labels) in enumerate(dataloader):

        batch_size = imgs.shape[0]

        # Adversarial ground truths
        # a (batch_size, 1) tensor filled with 1.0 (label for "real")
        valid = FloatTensor(batch_size, 1).fill_(1.0)

        # a (batch_size, 1) tensor filled with 0.0 (label for "fake")
        fake = FloatTensor(batch_size, 1).fill_(0.0)

        # Configure input
        real_imgs = imgs.type(FloatTensor)
        labels = labels.type(LongTensor)
  
        # -----------------
        #  Train Generator
        # -----------------

        ### START CODE HERE
        optimizer_G.zero_grad()

        # Sample noise and labels as generator input
        # a batch of noise vectors
        z = Variable(FloatTensor(np.random.normal(0, 1, (batch_size, opt.latent_dim))))
        # a batch of random labels
        gen_labels = Variable(LongTensor(np.random.randint(0, opt.n_classes, batch_size)))

        # feed z and gen_labels through the generator to produce a batch of images
        gen_imgs = generator(z, gen_labels)

        # Loss measures generator's ability to fool the discriminator
        # run the generated images through the discriminator to get validity scores
        validity = discriminator(gen_imgs, gen_labels)
        # the generator wants these to be judged "real", so compare against valid
        g_loss = adversarial_loss(validity, valid)
        # backprop + update
        g_loss.backward()
        optimizer_G.step()
        ### END CODE HERE

        # ---------------------
        #  Train Discriminator
        # ---------------------

        ### START CODE HERE
        optimizer_D.zero_grad()

        # loss on real images (should be judged real)
        validity_real = discriminator(real_imgs, labels)
        d_real_loss = adversarial_loss(validity_real, valid)

        # loss on generated images (detach so gradients do not flow into the generator)
        validity_fake = discriminator(gen_imgs.detach(), gen_labels)
        d_fake_loss = adversarial_loss(validity_fake, fake)

        # Total loss
        d_loss = (d_real_loss + d_fake_loss) / 2

        # backprop + update
        d_loss.backward()
        optimizer_D.step()
        ### END CODE HERE

        print(
            "[Epoch %d/%d] [Batch %d/%d] [D loss: %f] [G loss: %f]"
            % (epoch, opt.n_epochs, i, len(dataloader), d_loss.item(), g_loss.item())
        )
    # save a generator checkpoint every 20 epochs
    if (epoch + 1) % 20 == 0:
        torch.save(generator.state_dict(), "./cgan_generator_%d.pth" % (epoch + 1))
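The loop above assumes adversarial_loss, optimizer_G, and optimizer_D were created beforehand; they are not shown in this excerpt. Since the discriminator ends in a plain nn.Linear with no sigmoid, a least-squares loss on the raw scores is a natural fit. A sketch of one common setup (the loss choice and Adam hyperparameters are assumptions, and the two Linear layers are placeholders for the real Generator() and Discriminator()):

```python
import torch
import torch.nn as nn

# least-squares adversarial loss: compares raw validity scores to 1.0 / 0.0 targets
adversarial_loss = nn.MSELoss()

# tiny stand-ins so the optimizers can be constructed in isolation
generator = nn.Linear(150, 784)    # placeholder for Generator()
discriminator = nn.Linear(834, 1)  # placeholder for Discriminator()

# Adam with hyperparameters commonly used for GAN training
optimizer_G = torch.optim.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))
```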

Finally, test by generating images

def generate_latent_points(latent_dim, n_samples, n_classes):
    # Sample noise
    
    ### START CODE HERE
    # randomly sample latent vectors and labels for testing
    z = Variable(FloatTensor(np.random.normal(0, 1, (n_samples, latent_dim))))
    gen_labels = Variable(LongTensor(np.random.randint(0, n_classes, n_samples)))
    ### END CODE HERE
    
    return z, gen_labels
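Because generation is conditioned on the label, you can also fix the labels instead of sampling them at random, e.g. to get one row of outputs per digit class. A small sketch of that idea (self-contained; the commented-out last line shows where a trained generator would be called):

```python
import numpy as np
import torch

n_classes, n_per_class, latent_dim = 10, 5, 100

# noise: one latent vector per image to generate
z = torch.from_numpy(
    np.random.normal(0, 1, (n_classes * n_per_class, latent_dim))
).float()

# labels: [0,0,0,0,0, 1,1,1,1,1, ..., 9,9,9,9,9] so that each group of
# five generated images belongs to one digit class
labels = torch.arange(n_classes).repeat_interleave(n_per_class)

print(z.shape)     # torch.Size([50, 100])
print(labels[:7])  # tensor([0, 0, 0, 0, 0, 1, 1])
# gen_imgs = generator(z, labels)  # would produce 5 images of each class
```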

[Figure 2: sample images generated by the trained CGAN]
