In the previous article, we used encode_images.py from StyleGAN2 Encoder to map real face photos into the dlatents space, and then used face-editing direction vectors to make your girlfriend smile. For details, see:
Using StyleGAN2 with Ease (Part 3): One smile topples a city, a second topples a kingdom: making your girlfriend smile
The StyleGAN2 Encoder repository also offers another, equally convenient way to find the latents: project_images.py. It applies some optimizations to the stock StyleGAN2 projector (extending the dlatents space from 1x512 to 18x512), which noticeably improves the quality of the reconstructed faces.
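To make the 1x512 vs. 18x512 distinction concrete, here is a tiny numpy illustration (not part of the repository, just a sketch): the stock projector optimizes a single 512-dim code shared by all 18 style layers, while project_images.py lets each layer's code vary independently.
import numpy as np

# Stock projector: one w code, broadcast to all 18 style layers.
w = np.random.randn(1, 512)
dlatents_shared = np.tile(w, (18, 1))      # shape (18, 512), every row identical

# project_images.py: each of the 18 layers gets its own code,
# and this extra freedom is what improves the reconstruction.
dlatents_free = np.random.randn(18, 512)   # shape (18, 512), rows independent

print(dlatents_shared.shape, dlatents_free.shape)  # (18, 512) (18, 512)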
Take the images below as an example:
[Figure: original image | StyleGAN2 Encoder reconstruction | smiling | wearing glasses | turned into a young man]
The steps are as follows:
(1) Modify projector.py:
# vgg16_pkl = 'https://drive.google.com/uc?id=1N2-m9qszOeVC9Tq77WxsLnuWwOedQiD2',
vgg16_pkl = './models/vgg16_zhang_perceptual.pkl',
The perceptual model "vgg16_zhang_perceptual.pkl" can be downloaded from Baidu Netdisk:
https://pan.baidu.com/s/1vP6NM9-w4s3Cy6l4T7QpbQ
Extraction code: 5qkp
After downloading, copy the file into the "./models" directory under your working directory.
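Before moving on, it is worth a quick check that the file is where projector.py now expects it (a trivial sketch):
import os

pkl_path = './models/vgg16_zhang_perceptual.pkl'
assert os.path.isfile(pkl_path), 'download vgg16_zhang_perceptual.pkl into ./models first'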
(2) Create temporary directories under the working directory (they are used by project_images.py below):
./stylegan2-tmp
./stylegan2-tmp/dataset
./stylegan2-tmp/video
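If you prefer not to create them by hand, a short Python sketch does the same thing (directory names exactly as listed above):
import os

for d in ('./stylegan2-tmp/dataset', './stylegan2-tmp/video'):
    os.makedirs(d, exist_ok=True)   # also creates ./stylegan2-tmp itself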
(3) Align and crop the original images under "./raw_images" and place the results in "./aligned_images":
python align_images.py raw_images/ aligned_images/
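To confirm the alignment step found a face in every photo, a quick count helps (this assumes align_images.py writes one PNG per detected face, named like <image>_01.png):
import glob

raw = glob.glob('raw_images/*')
aligned = glob.glob('aligned_images/*.png')
print(f'{len(raw)} raw images -> {len(aligned)} aligned faces')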
(4) Project the face images under "./aligned_images" into the dlatents space, find the best dlatents for each, and save both the dlatents files and the reconstructed face images under "./generated_images":
python project_images.py aligned_images/ generated_images/ --vgg16-pkl=./models/vgg16_zhang_perceptual.pkl
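Each input image yields a .npy dlatents file next to its reconstructed PNG. A quick inspection (the filename follows the one used later in this article) confirms the 18x512 shape mentioned above:
import numpy as np

dlatents = np.load('generated_images/Scarlett-Johansson_01.npy')
print(dlatents.shape)   # expected: (18, 512)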
(5) This latent space appears to differ somewhat from the one produced by encode_images.py, so the face-editing code needs minor changes (the coeffs passed in have been adjusted). The source code is below for reference:
import os
import pickle
import PIL.Image
import numpy as np
import dnnlib
import dnnlib.tflib as tflib
import config
from encoder.generator_model import Generator
import matplotlib.pyplot as plt
import glob

# Pre-trained network.
Model = './models/stylegan2-ffhq-config-f.pkl'

synthesis_kwargs = dict(output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True), minibatch_size=8)

_Gs_cache = dict()

# Load the trained StyleGAN network model (cached after the first call).
def load_Gs(model):
    if model not in _Gs_cache:
        model_file = glob.glob(model)
        if len(model_file) == 1:
            with open(model_file[0], "rb") as f:
                _G, _D, Gs = pickle.load(f)
        else:
            raise Exception('Failed to find the model')
        # _G = Instantaneous snapshot of the generator. Mainly useful for resuming a previous training run.
        # _D = Instantaneous snapshot of the discriminator. Mainly useful for resuming a previous training run.
        # Gs = Long-term average of the generator. Yields higher-quality results than the instantaneous snapshot.
        # Print network details.
        # Gs.print_layers()
        _Gs_cache[model] = Gs
    return _Gs_cache[model]

# Generate a 256x256 preview image from an 18x512 latent vector.
def generate_image(generator, latent_vector):
    latent_vector = latent_vector.reshape((1, 18, 512))
    generator.set_dlatents(latent_vector)
    img_array = generator.generate_images()[0]
    img = PIL.Image.fromarray(img_array, 'RGB')
    return img.resize((256, 256))

# Show the edit at several strengths, then save the image for the coeff the user picks.
def move_and_show(generator, flag, latent_vector, direction, coeffs):
    fig, ax = plt.subplots(1, len(coeffs), figsize=(15, 10), dpi=80)
    for i, coeff in enumerate(coeffs):
        new_latent_vector = latent_vector.copy()
        # Only the first 8 of the 18 style layers are moved; the remaining
        # layers are left untouched to preserve fine details.
        new_latent_vector[:8] = (latent_vector + coeff * direction)[:8]
        ax[i].imshow(generate_image(generator, new_latent_vector))
        ax[i].set_title('Coeff: %0.1f' % coeff)
    [x.axis('off') for x in ax]
    plt.show()

    favor_coeff = float(input('Please input your favourite coeff, such as -1.5 or 1.5: '))
    new_latent_vector = latent_vector.copy()
    new_latent_vector[:8] = (latent_vector + favor_coeff * direction)[:8]
    new_latent_vector = new_latent_vector.reshape((1, 18, 512))
    generator.set_dlatents(new_latent_vector)
    new_person_image = generator.generate_images()[0]
    canvas = PIL.Image.new('RGB', (1024, 1024), 'white')
    canvas.paste(PIL.Image.fromarray(new_person_image, 'RGB'), (0, 0))
    filenames = {0: 'new_age.png', 1: 'new_angle.png', 2: 'new_gender.png',
                 3: 'new_eyes.png', 4: 'new_glasses.png', 5: 'new_smile.png'}
    canvas.save(os.path.join(config.generated_dir, filenames[flag]))

def main():
    tflib.init_tf()
    Gs_network = load_Gs(Model)
    generator = Generator(Gs_network, batch_size=1, randomize_noise=False)
    os.makedirs(config.dlatents_dir, exist_ok=True)

    # person = np.load(os.path.join(config.dlatents_dir, 'Scarlett Johansson01_01.npy'))
    person = np.load(os.path.join(config.generated_dir, 'Scarlett-Johansson_01.npy'))

    # Loading already learned latent directions
    age_direction = np.load('ffhq_dataset/latent_directions/age.npy')
    angle_direction = np.load('ffhq_dataset/latent_directions/angle_horizontal.npy')
    gender_direction = np.load('ffhq_dataset/latent_directions/gender.npy')
    eyes_direction = np.load('ffhq_dataset/latent_directions/eyes_open.npy')
    glasses_direction = np.load('ffhq_dataset/latent_directions/glasses.npy')
    smile_direction = np.load('ffhq_dataset/latent_directions/smile.npy')

    # Note the larger coeff ranges compared with the encode_images.py version.
    move_and_show(generator, 0, person, age_direction, [-20, -16, -12, -8, 0, 8, 12, 16, 20])
    move_and_show(generator, 1, person, angle_direction, [-40, -32, -24, -16, 0, 16, 24, 32, 40])
    move_and_show(generator, 2, person, gender_direction, [-20, -16, -12, -8, 0, 8, 12, 16, 20])
    move_and_show(generator, 3, person, eyes_direction, [-8, -6, -4, -2, 0, 2, 4, 6, 8])
    move_and_show(generator, 4, person, glasses_direction, [-16, -12, -8, -4, 0, 4, 8, 12, 16])
    move_and_show(generator, 5, person, smile_direction, [-16, -12, -8, -4, 0, 4, 8, 12, 16])

if __name__ == "__main__":
    main()
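For batch use, the interactive prompt can be skipped entirely. Below is a minimal non-interactive sketch under the same assumptions as the script above (same model path, same .npy filenames); it applies one fixed smile coefficient and saves the result:
import pickle
import numpy as np
import PIL.Image
import dnnlib.tflib as tflib
from encoder.generator_model import Generator

tflib.init_tf()
with open('./models/stylegan2-ffhq-config-f.pkl', 'rb') as f:
    _G, _D, Gs = pickle.load(f)
generator = Generator(Gs, batch_size=1, randomize_noise=False)

person = np.load('generated_images/Scarlett-Johansson_01.npy')
smile_direction = np.load('ffhq_dataset/latent_directions/smile.npy')

edited = person.copy()
edited[:8] = (person + 8.0 * smile_direction)[:8]   # coeff = 8.0, picked from the range above
generator.set_dlatents(edited.reshape((1, 18, 512)))
PIL.Image.fromarray(generator.generate_images()[0], 'RGB').save('new_smile_8.png')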
(End)