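TensorFlow Implementation of Style Loss and Content Loss

As papers in related areas show, perceptual loss is mainly divided into content loss and style loss.

Content loss is mainly the L1 or L2 norm of the difference between the two objects being compared.

Style loss first computes the Gram matrix of each of the two objects, then takes the L1 or L2 norm of the difference between those Gram matrices.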
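The two definitions can be written compactly as follows. This is only a sketch: the symbols are assumptions introduced here, not notation from the original post. \phi_l(\cdot) stands for the feature map taken from layer l of a network such as VGG, G(\cdot) for the Gram matrix, and p for the chosen norm. Normalization constants are left out.

```latex
\mathcal{L}_{\text{content}} = \sum_{l} \lVert \phi_l(x) - \phi_l(y) \rVert_p,
\qquad
\mathcal{L}_{\text{style}} = \sum_{l} \lVert G(\phi_l(x)) - G(\phi_l(y)) \rVert_p,
\qquad p \in \{1, 2\}
```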

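The Gram matrix computation can be understood as follows:

The content is a feature map extracted by a network such as VGG, with shape [b, h, w, c]: [batch size, height, width, channels].

The required Gram matrix is obtained by multiplying [b, c, hw] by [b, hw, c], which gives [b, c, c].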
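The shape bookkeeping can be checked on a small random array. The sketch below uses NumPy rather than TensorFlow so that it runs without building a network. The helper name `gram_matrix_np` is hypothetical, and no normalization is applied here.

```python
import numpy as np


def gram_matrix_np(features):
    """Unnormalized Gram matrix of a [b, h, w, c] array, returned as [b, c, c]."""
    b, h, w, c = features.shape
    # Move channels ahead of the spatial axes, then flatten h and w: [b, c, hw].
    flat = features.transpose(0, 3, 1, 2).reshape(b, c, h * w)
    # Batched product: [b, c, hw] @ [b, hw, c] -> [b, c, c].
    return flat @ flat.transpose(0, 2, 1)


rng = np.random.default_rng(0)
features = rng.standard_normal((2, 4, 5, 3))  # b=2, h=4, w=5, c=3
gram = gram_matrix_np(features)
print(gram.shape)  # (2, 3, 3)
```

Each [c, c] slice is symmetric, and entry (i, j) is the inner product of channel i and channel j over all spatial positions.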

The code is as follows:

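import numpy as np
import tensorflow as tf


def ContentLoss(messageresult, compareresult):
    # Both arguments are lists of feature maps, one per layer, each [batch, h, w, c]
    result = 0

    for x, y in zip(messageresult, compareresult):
        shape = x.get_shape().as_list()
        # k normalizes by the number of elements per sample (h * w * c);
        # cast to a Python float so the division does not depend on NumPy integer types
        k = float(np.prod(shape[1:]))
        diff = x - y
        # With ord=1 and no axis, tf.norm sums |diff| over the whole tensor
        diff = tf.norm(diff, ord=1) / k
        result = result + diff

    return result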

# Compute the Gram matrix
def gram_matrix(features):
    # features [batch, h, w, c]; h, w and c must be statically known
    features = tf.transpose(features, perm=[0, 3, 1, 2])  # [batch, c, h, w]
    shape = features.get_shape().as_list()
    channel = shape[1]
    dim = int(np.prod(shape[2:]))
    features = tf.reshape(features, [-1, channel, dim])   # [batch, c, hw]
    # k is used for normalization
    k = channel * dim
    transposed = tf.transpose(features, perm=[0, 2, 1])   # [batch, hw, c]
    result = tf.matmul(features, transposed) / k    # result [batch, c, c]
    return result, channel

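def StyleLoss(messageresult, compareresult):
    # Both arguments are lists of feature maps, one per layer, each [batch, h, w, c]
    result = 0

    for x, y in zip(messageresult, compareresult):
        X, cx = gram_matrix(x)
        Y, cy = gram_matrix(y)
        if cx != cy:
            print("The channel of feature map is not the same!")
            return 0
        # Normalize by the number of entries in one [c, c] Gram matrix
        k = cx * cy
        diff = X - Y
        diff = tf.norm(diff, ord=1) / k   # some papers use ord=1, others use ord=2
        result = result + diff

    return result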
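To sanity-check the TensorFlow functions above on inputs small enough to verify by hand, a NumPy reference can help. The sketch below is meant to mirror the same normalization (c * hw for the Gram matrix, h * w * c for the content loss, c * c for the style loss) with the L1 variant. The names `gram_np`, `content_loss_np` and `style_loss_np` are hypothetical helpers, not part of any library.

```python
import numpy as np


def gram_np(features):
    """Gram matrix of a [b, h, w, c] array, divided by c * h * w; shape [b, c, c]."""
    b, h, w, c = features.shape
    flat = features.transpose(0, 3, 1, 2).reshape(b, c, h * w)
    return flat @ flat.transpose(0, 2, 1) / (c * h * w)


def content_loss_np(xs, ys):
    """Sum over layers of the summed absolute difference divided by h * w * c."""
    return float(sum(np.abs(x - y).sum() / np.prod(x.shape[1:])
                     for x, y in zip(xs, ys)))


def style_loss_np(xs, ys):
    """Sum over layers of the summed absolute Gram difference divided by c * c."""
    return float(sum(np.abs(gram_np(x) - gram_np(y)).sum() / x.shape[-1] ** 2
                     for x, y in zip(xs, ys)))


# One layer, b=1, h=w=c=2: all-ones features against all-zeros features.
x = np.ones((1, 2, 2, 2))
y = np.zeros((1, 2, 2, 2))

content = content_loss_np([x], [y])
style = style_loss_np([x], [y])
print(content)  # 1.0
print(style)    # 0.5
```

For the all-ones input every Gram entry is 4 / 8 = 0.5, so the four entries sum to 2.0 and dividing by c * c = 4 gives 0.5. Identical inputs give zero for both losses.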

 
