A PyTorch Implementation of the Original Style Transfer Paper

A PyTorch implementation of the paper that started neural style transfer, "A Neural Algorithm of Artistic Style" (http://arxiv.org/abs/1508.06576).

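Introduction

I have recently been working on style transfer and tried running several other PyTorch implementations of "A Neural Algorithm of Artistic Style". Their output, however, can suddenly turn into noise at some epochs and then gradually recover, for example at epochs 45, 170, and 230 in Figure 1. I do not know why this happens, since the original Lua version is stable.

Figure 1

I therefore wrote a simple implementation of the method from the paper, shown below. It is fairly short: the main program is under 100 lines.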

from vgg import Vgg16
import utils
from torchvision import transforms
import torch.nn as nn
import torch.optim as optim

# Settings.
# argparse.ArgumentParser() could be used here instead of hard-coded constants.

EPOCHS = 100000000
STYLE_IMG_PATH = './style/square.jpg'
CONTENT_IMG_PATH = './content/boy.bmp'
OUTPUT_DIR = './output/'
IMAGE_SIZE = 512
BATCH_SIZE = 1
LEARNING_RATE = 0.01
CONTENT_WEIGHT = 100
STYLE_WEIGHT = 10000000

transform = transforms.Compose([
    transforms.Resize(IMAGE_SIZE),
    transforms.ToTensor(),
])

style_transform = transforms.Compose([
    transforms.Resize(IMAGE_SIZE),
    transforms.ToTensor(),
])

vgg = Vgg16(requires_grad=False).cuda()  # vgg16 model

style_img = utils.load_image(filename=STYLE_IMG_PATH, size=IMAGE_SIZE)
content_img = utils.load_image(filename=CONTENT_IMG_PATH, size=IMAGE_SIZE)

style_img = style_transform(style_img)
content_img = transform(content_img)

style_img = style_img.repeat(BATCH_SIZE, 1, 1, 1).cuda()  # make fake batch
content_img = content_img.repeat(BATCH_SIZE, 1, 1, 1).cuda()

features_style = vgg(style_img)  # feature maps extracted from VGG
features_content = vgg(content_img)

gram_style = [utils.gram_matrix(y) for y in features_style]  # gram matrix of style feature

mse_loss = nn.MSELoss()

y = content_img.detach()  # y is the image being optimized; it starts from the content image
y = y.requires_grad_()  # track gradients with respect to y

optimizer = optim.Adam([y], lr=LEARNING_RATE)  # let optimizer optimize the tensor y

print(" Start training ========================================")
for epoch in range(EPOCHS):

    def closure():
        optimizer.zero_grad()
        y.data.clamp_(0, 1)
        features_y = vgg(y)  # feature maps of y extracted from VGG
        gram_style_y = [utils.gram_matrix(i) for i in features_y]  # Gram matrices of features_y at relu1_2, 2_2, 3_3, 4_3

        fc = features_content.relu4_3  # content target in relu4_3
        fy = features_y.relu4_3  # y in relu4_3

        style_loss = 0  # add style_losses in relu1_2,2_2,3_3,4_3
        for fy_gm, fs_gm in zip(gram_style_y, gram_style):
            style_loss += mse_loss(fy_gm, fs_gm)
        style_loss = STYLE_WEIGHT * style_loss

        # fy_gm = gram_style_y[3]
        # fs_gm = gram_style[3]
        # style_loss = STYLE_WEIGHT * mse_loss(fy_gm, fs_gm)

        content_loss = CONTENT_WEIGHT * mse_loss(fc, fy)  # content loss

        total_loss = content_loss + style_loss
        total_loss.backward(retain_graph=True)

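        if epoch % 100 == 0:
            # {:.4f} prints four decimal places; .item() converts the 0-dim tensors to Python floats
            print("Epoch {}: Style Loss: {:.4f} Content Loss: {:.4f}".format(epoch, style_loss.item(), content_loss.item()))
        if epoch % 1000 == 0:
            utils.save_image_epoch(y, OUTPUT_DIR, epoch)  # use the OUTPUT_DIR setting defined above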
        return total_loss


    optimizer.step(closure)
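
The listing above calls `utils.gram_matrix`, which is not shown in this post. As a rough sketch of what such a helper usually looks like, the function below computes one Gram matrix per batch item from a `(B, C, H, W)` feature map. The body and the division by `C * H * W` are assumptions for illustration, not necessarily what the `utils` module in the repository does.

```python
import torch


def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Return a (B, C, C) Gram matrix for a (B, C, H, W) feature map.

    Hypothetical stand-in for utils.gram_matrix; the scaling factor is an assumption.
    """
    b, c, h, w = features.size()
    flat = features.reshape(b, c, h * w)           # flatten the spatial dimensions
    gram = torch.bmm(flat, flat.transpose(1, 2))   # channel-by-channel inner products
    return gram / (c * h * w)                      # scale by the feature-map size
```

Because the Gram matrix discards spatial positions and keeps only channel correlations, matching it between the output and the style image captures texture rather than layout.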

Results

The results are shown in Figure 2 below.

**Figure 2**

The output of this code appears to be more stable: it does not suddenly turn into noise at certain epochs, as Figure 3 shows.


**Figure 3**

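I compute the losses directly inside the optimization loop instead of building them as loss modules inside the network. I also removed input normalization, since the results looked about the same with or without it, and used the Adam optimizer.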
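
For readers who want to put the normalization step back and compare, here is a minimal sketch. It assumes the VGG weights are the ImageNet-pretrained ones from torchvision, which expect inputs normalized with the channel means and standard deviations below. The helper name `normalize_batch` is hypothetical and is not part of the code in this post.

```python
import torch

# Channel statistics conventionally used with ImageNet-pretrained torchvision models.
IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)


def normalize_batch(batch: torch.Tensor) -> torch.Tensor:
    """Normalize a (B, 3, H, W) batch whose values lie in [0, 1]."""
    mean = IMAGENET_MEAN.to(device=batch.device, dtype=batch.dtype)
    std = IMAGENET_STD.to(device=batch.device, dtype=batch.dtype)
    return (batch - mean) / std  # out-of-place, so gradients still flow back to the input
```

In the loop this would mean calling `vgg(normalize_batch(y))` instead of `vgg(y)`, and applying the same function to the style and content images before extracting their features.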

This code may optimize more slowly than other PyTorch implementations and needs more epochs to converge. I have not worked out why.
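
One difference worth testing, offered as a guess rather than a diagnosis, is the optimizer: the original Lua code defaults to L-BFGS, and the `closure` pattern used in the loop above is exactly what `torch.optim.LBFGS` requires. The toy sketch below shows the swap on a small quadratic problem; the tensor `target` and the hyperparameters are made up for illustration and say nothing about how L-BFGS behaves on the actual style transfer loss.

```python
import torch
import torch.optim as optim

# Toy problem (hypothetical): move x towards a fixed target vector.
target = torch.tensor([1.0, -2.0, 3.0])
x = torch.zeros(3, requires_grad=True)

# L-BFGS re-evaluates the loss several times per step, so it needs a closure.
optimizer = optim.LBFGS([x], lr=1.0, max_iter=20)


def closure():
    optimizer.zero_grad()
    loss = ((x - target) ** 2).sum()
    loss.backward()
    return loss


for step in range(5):
    optimizer.step(closure)

final_loss = ((x - target) ** 2).sum().item()
```

In the style transfer loop, the equivalent change would be replacing `optim.Adam([y], lr=LEARNING_RATE)` with `optim.LBFGS([y])` and keeping `optimizer.step(closure)` as it is. Each L-BFGS step may call the closure many times, so the per-epoch logging and image saving inside the closure would fire more than once per epoch.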

The code has been uploaded to GitHub; I hope it is useful.
