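The source code comes from the ESPCN part of the GitHub repository https://github.com/JuheonYi/VESPCN-tensorflow.
First, a quick look at how the ESPCN network is built: conv, conv, conv, PS (the code also applies a final 1x1 convolution, CONV_OUT, after PS).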
def network(self, LR):
    feature_tmp = tf.layers.conv2d(LR, 64, 5, strides=1, padding='SAME', name='CONV_1',
                                   kernel_initializer=tf.contrib.layers.xavier_initializer(), reuse=tf.AUTO_REUSE)
    feature_tmp = tf.nn.relu(feature_tmp)
    feature_tmp = tf.layers.conv2d(feature_tmp, 32, 3, strides=1, padding='SAME', name='CONV_2',
                                   kernel_initializer=tf.contrib.layers.xavier_initializer(), reuse=tf.AUTO_REUSE)
    feature_tmp = tf.nn.relu(feature_tmp)
    feature_out = tf.layers.conv2d(feature_tmp, self.channels*self.scale*self.scale, 3, strides=1, padding='SAME',
                                   name='CONV_3', kernel_initializer=tf.contrib.layers.xavier_initializer())
    feature_out = PS(feature_out, self.scale, color=False)
    feature_out = tf.layers.conv2d(feature_out, 1, 1, strides=1, padding='SAME',
                                   name='CONV_OUT', kernel_initializer=tf.contrib.layers.xavier_initializer(), reuse=tf.AUTO_REUSE)
    return feature_out
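To make the tensor sizes concrete, the sketch below traces the shape after each layer of the network above. It is only bookkeeping, not TensorFlow code: `trace_shapes` is a hypothetical helper, and it assumes stride 1 with 'SAME' padding (so the convolutions keep H and W) and a single-channel input, which is what `color=False` implies.

```python
def trace_shapes(h, w, channels=1, scale=3):
    """Return (layer name, (H, W, C)) pairs for the ESPCN network above.

    Assumes stride-1 'SAME' convolutions, so only the channel count changes
    until the pixel shuffle step, and a single-channel input (channels=1).
    """
    shapes = [("LR", (h, w, channels))]
    shapes.append(("CONV_1", (h, w, 64)))
    shapes.append(("CONV_2", (h, w, 32)))
    shapes.append(("CONV_3", (h, w, channels * scale * scale)))
    # PS with color=False folds the r * r channels into an r-times larger image.
    shapes.append(("PS", (h * scale, w * scale, 1)))
    # CONV_OUT is a 1x1 convolution with one output channel.
    shapes.append(("CONV_OUT", (h * scale, w * scale, 1)))
    return shapes


for name, shape in trace_shapes(17, 17, channels=1, scale=3):
    print(name, shape)
```

The 17 * 17 input size and the scale of 3 are arbitrary illustration values, not settings taken from the repository.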
The PS operation here is pixel shuffle.
The PS operation rearranges a tensor of shape H * W * (C * r * r) into rH * rW * C, enlarging the spatial size from H * W to rH * rW.
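As a small worked example of that shape change, the sketch below takes a single low-resolution pixel (H = W = 1) with r * r = 4 channels and lays the four values out as one 2 * 2 block. The row-major placement of the channels inside the block is an assumption here; it matches the indexing used by the loop-based `shuffle` function later in this post. `shuffle_one_pixel` is a hypothetical name.

```python
def shuffle_one_pixel(values, r):
    """Arrange the r * r channel values of one pixel as an r x r block."""
    assert len(values) == r * r
    # Channel index p * r + q goes to row p, column q of the block.
    return [[values[p * r + q] for q in range(r)] for p in range(r)]


block = shuffle_one_pixel([10, 20, 30, 40], r=2)
for row in block:
    print(row)
```

Doing this for every low-resolution pixel and tiling the blocks side by side gives the rH * rW output.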
def PS(X, r, color=False):
    #print("Input X shape:",X.get_shape(),"scale:",r)
    if color:
        Xc = tf.split(X, 3, 3)
        X = tf.concat([_phase_shift(x, r) for x in Xc], 3)  # each x in Xc has r * r channels; each group becomes one upscaled channel
    else:
        X = _phase_shift_1dim(X, r)
    #print("output X shape:",X.get_shape())
    return X
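For tf.split, see the TensorFlow API docs: https://www.tensorflow.org/api_docs/python/tf/split, or just search for it.
In short, the result is Xc: three groups, one per color channel, each of shape H * W * (r * r). Each group is then processed in turn, and its r * r values are interleaved into H and W (the shuffle).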
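For a feel of what the split produces, here is a NumPy sketch. `np.split` is used as a stand-in for `tf.split` (both cut a tensor into equal pieces along one axis), and the shape values are made up for illustration.

```python
import numpy as np

r = 2
batch, h, w = 1, 4, 5
# A fake CONV_3 output for a color image: 3 * r * r channels on the last axis.
X = np.zeros((batch, h, w, 3 * r * r), dtype=np.float32)

# Cut the channel axis (axis 3) into 3 equal groups, one per color channel.
Xc = np.split(X, 3, axis=3)

print(len(Xc))
print(Xc[0].shape)
```

Each of the three pieces keeps the batch, height and width axes and holds r * r channels.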
The details:
def _phase_shift(I, r):
    bsize, w, h, c = I.get_shape().as_list()
    bsize = tf.shape(I)[0]
    X = tf.reshape(I, (bsize, w, h, r, r))
    X = tf.split(X, w, 1)  # split axis 1 (the w axis) into w pieces, each of size 1 on that axis
    # tf.squeeze removes the size-1 axis; the w pieces are then concatenated along axis 2 (an r axis), giving r * w
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # now bsize, h, r * w, r
    X = tf.split(X, h, 1)
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)
    return tf.reshape(X, (bsize, w * r, h * r, 1))  # final shape
def _phase_shift_1dim(I, r):
    bsize, h, w, c = I.shape
    bsize = I.shape[0]
    X = tf.reshape(I, (bsize, h, w, r, r))
    # axis 1 has size h, so split it into h pieces
    X = tf.split(X, h, 1)
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # bsize, w, h * r, r
    X = tf.split(X, w, 1)
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # bsize, h * r, w * r
    return tf.reshape(X, (bsize, h * r, w * r, 1))
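The key part is the split and concat steps: together they take the pixels apart and regroup them, turning one spatial dimension a into r * a, and the other dimension b into r * b in the same way.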
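To check what the split and concat steps actually do to the pixels, the sketch below re-creates them in NumPy for one channel group and compares the result with a direct index formula. `phase_shift_np` is a hypothetical helper, and `np.split`, `np.squeeze` and `np.concatenate` stand in for their TensorFlow counterparts.

```python
import numpy as np


def phase_shift_np(I, r):
    """NumPy re-creation of the split / squeeze / concat steps for one channel group."""
    bsize, a, b, c = I.shape
    assert c == r * r
    X = I.reshape(bsize, a, b, r, r)
    # Split axis 1 into a pieces, drop the size-1 axis, join along axis 2.
    X = np.concatenate([np.squeeze(x, axis=1) for x in np.split(X, a, axis=1)], axis=2)
    # X is now (bsize, b, a * r, r); repeat the same steps for the b axis.
    X = np.concatenate([np.squeeze(x, axis=1) for x in np.split(X, b, axis=1)], axis=2)
    # X is now (bsize, a * r, b * r).
    return X.reshape(bsize, a * r, b * r, 1)


r = 2
I = np.arange(1 * 2 * 3 * r * r).reshape(1, 2, 3, r * r)
out = phase_shift_np(I, r)

# Every output pixel (y, x) should read input channel (y % r) * r + (x % r)
# of the low-resolution pixel (y // r, x // r).
ok = all(
    out[0, y, x, 0] == I[0, y // r, x // r, (y % r) * r + (x % r)]
    for y in range(2 * r)
    for x in range(3 * r)
)
print(out.shape, ok)
```

The input here is deliberately non-square (a = 2, b = 3) so that a mix-up between the two spatial axes would show up in the check.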
From: https://github.com/drakelevy/ESPCN-TensorFlow
The shuffle operation is as follows:
import numpy as np

def shuffle(input_image, ratio):
    shape = input_image.shape
    height = int(shape[0]) * ratio
    width = int(shape[1]) * ratio
    channels = int(shape[2]) // ratio // ratio
    shuffled = np.zeros((height, width, channels), dtype=np.uint8)
    for i in range(0, height):
        for j in range(0, width):
            for k in range(0, channels):
                # each pixel is a stack of channels (three for RGB), filled one channel at a time
                shuffled[i,j,k] = input_image[i // ratio, j // ratio, k * ratio * ratio + (i % ratio) * ratio + (j % ratio)]
    return shuffled
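Simple and brute-force: it rearranges the values directly, assembling a new image from the original one, with each pixel set individually. (Think of it in plain Python terms: a three-dimensional array, where each dimension, i.e. each nested array, is handled in turn.)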
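The same index arithmetic can also be written without explicit loops. The sketch below is one way to do it with reshape and transpose, checked against a loop that uses the same indexing as the function above; `shuffle_loop` and `shuffle_vectorized` are hypothetical names, not part of either repository.

```python
import numpy as np


def shuffle_loop(input_image, ratio):
    """Reference: the same indexing as the loop-based shuffle above."""
    h, w, c = input_image.shape
    out = np.zeros((h * ratio, w * ratio, c // ratio // ratio), dtype=input_image.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = input_image[
                    i // ratio, j // ratio,
                    k * ratio * ratio + (i % ratio) * ratio + (j % ratio)]
    return out


def shuffle_vectorized(input_image, ratio):
    """Same result via reshape + transpose."""
    h, w, c = input_image.shape
    channels = c // (ratio * ratio)
    # Channel index k * r * r + p * r + q  ->  axes (k, p, q).
    x = input_image.reshape(h, w, channels, ratio, ratio)
    # Reorder to (h, p, w, q, k) so rows become h * r + p and columns w * r + q.
    x = x.transpose(0, 3, 1, 4, 2)
    return x.reshape(h * ratio, w * ratio, channels)


img = np.arange(2 * 3 * 12, dtype=np.uint8).reshape(2, 3, 12)
same = np.array_equal(shuffle_loop(img, 2), shuffle_vectorized(img, 2))
print(same)
```

The loop version is easier to read against the formula; the vectorized one avoids a Python-level loop over every output value.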
PyTorch, by contrast, provides an official pixel shuffle module:
CLASS torch.nn.PixelShuffle(upscale_factor)
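According to the PyTorch documentation, this module maps an input of shape (N, C * r^2, H, W) to an output of shape (N, C, H * r, W * r), so channels come first rather than last as in the TensorFlow code above. The NumPy sketch below illustrates that layout without requiring PyTorch; `pixel_shuffle_nchw` is a hypothetical helper, and the exact element ordering inside each r * r block is my understanding of the module rather than something stated in this post.

```python
import numpy as np


def pixel_shuffle_nchw(x, r):
    """NumPy sketch of a channels-first pixel shuffle."""
    n, c, h, w = x.shape
    out_c = c // (r * r)
    # Channel index k * r * r + p * r + q  ->  axes (k, p, q).
    x = x.reshape(n, out_c, r, r, h, w)
    # Reorder to (n, k, h, p, w, q), then merge (h, p) and (w, q).
    x = x.transpose(0, 1, 4, 2, 5, 3)
    return x.reshape(n, out_c, h * r, w * r)


x = np.arange(1 * 8 * 2 * 3).reshape(1, 8, 2, 3)
y = pixel_shuffle_nchw(x, 2)
print(y.shape)
```

With PyTorch installed, `torch.nn.PixelShuffle(2)` applied to a tensor of the same shape should produce the same output shape.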