Load the images and resize them to 224×224:
import numpy as np
import matplotlib.pyplot as plt
import torch
from PIL import Image
img_path_1 = './xxx/xxx.jpg'
img_path_2 = './xxx/xxx.jpg'
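image_1 = Image.open(img_path_1).convert('RGB')  # force 3 channels (guards against grayscale/RGBA files)
image_1 = np.array(image_1)                      # H x W x 3, uint8
image_2 = Image.open(img_path_2).convert('RGB')
image_2 = np.array(image_2)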
plt.imshow(image_1)
plt.show()
plt.imshow(image_2)
plt.show()
Convert the arrays to tensors:
source_image_1 = np.expand_dims(image_1.transpose((2,0,1)),0)
source_image_1 = torch.Tensor(source_image_1.astype(np.float32)/255.0)
source_image_2 = np.expand_dims(image_2.transpose((2,0,1)),0)
source_image_2 = torch.Tensor(source_image_2.astype(np.float32)/255.0)
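The two conversions above repeat the same steps, so they could be folded into one helper. A minimal sketch, assuming an H×W×3 uint8 array as input; `to_nchw_float` is a hypothetical name that is not part of the repository, and a random array stands in for a real image:

```python
import numpy as np

def to_nchw_float(image):
    """Convert an HxWx3 uint8 array to a 1x3xHxW float32 array in [0, 1]."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an HxWx3 array, got shape %s" % (image.shape,))
    chw = image.transpose((2, 0, 1))          # HWC -> CHW
    batch = np.expand_dims(chw, 0)            # add batch axis -> 1xCxHxW
    return batch.astype(np.float32) / 255.0   # scale to [0, 1]

# Hypothetical stand-in for a decoded image (300 rows, 400 columns, RGB).
dummy = np.random.randint(0, 256, size=(300, 400, 3), dtype=np.uint8)
batch = to_nchw_float(dummy)
print(batch.shape, batch.dtype)
```

The result can then be wrapped with torch.from_numpy(batch) to get the tensor.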
How should the images be resized? Is the approach below only a stopgap?
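from geotnf.transformation import GeometricTnf  # import path assumed from the cnngeometric_pytorch repo layout
transformer_1 = GeometricTnf(out_h=224, out_w=224, use_cuda=False)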
image_var_1 = transformer_1(source_image_1)
image_var_2 = transformer_1(source_image_2)
Now image_var_1 and image_var_2 both have shape torch.Size([1, 3, 224, 224]).
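On the resizing question, GeometricTnf is probably not the only option. A minimal sketch of one alternative, assuming plain bilinear resampling is acceptable: it swaps in torch.nn.functional.interpolate for GeometricTnf, and a random tensor stands in for source_image_1.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for source_image_1: one RGB image, 300x400, values in [0, 1].
source = torch.rand(1, 3, 300, 400)

# Bilinear resampling to 224x224; the aspect ratio is not preserved.
resized = F.interpolate(source, size=(224, 224), mode='bilinear', align_corners=False)
print(resized.shape)   # torch.Size([1, 3, 224, 224])
```

I have not checked whether this matches the output of GeometricTnf pixel for pixel, so it is only a candidate replacement.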
Truncate the VGG16 model after the fourth pooling layer to turn it into a feature extractor, i.e. keep only these layers:
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
The model is cut here; the layers after this point are not used.
Instantiate it and extract the feature maps:
from model.cnn_geometric_model import FeatureExtraction
model_extraction = FeatureExtraction(feature_extraction_cnn='vgg',use_cuda = False)
feature_map_1 = model_extraction(image_var_1)
feature_map_2 = model_extraction(image_var_2)
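The extracted feature maps (feature_map_1 and feature_map_2) have shape torch.Size([1, 512, 14, 14]).
This agrees with the VGG16 architecture: four stride-2 poolings reduce 224 to 224 / 16 = 14, and the last retained convolution has 512 output channels.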
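The shape can be checked by hand: each 3×3 convolution with padding 1 keeps the spatial size, and each 2×2 max-pool with stride 2 halves it. A small sketch of that arithmetic, with the layer listing abbreviated into a hypothetical config list (integers are conv output channels, 'M' is a pooling layer):

```python
# Truncated VGG16 "features" config up to the fourth pooling layer.
# Integers are conv output channels; 'M' marks a 2x2, stride-2 max-pool.
CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M']

def output_shape(height, width, cfg=CFG, in_channels=3):
    """Return (channels, height, width) after passing through cfg."""
    channels = in_channels
    for layer in cfg:
        if layer == 'M':
            height, width = height // 2, width // 2   # floor division, ceil_mode=False
        else:
            channels = layer                          # 3x3 conv, padding 1: size unchanged
    return channels, height, width

print(output_shape(224, 224))   # (512, 14, 14)
```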
What confuses me is the next two steps: feature matching and feature regression.
Feature matching:
from model.cnn_geometric_model import FeatureCorrelation
model_fc = FeatureCorrelation()
correlation_tensor = model_fc(feature_map_1, feature_map_2)
The resulting correlation_tensor has shape torch.Size([1, 196, 14, 14]). But this does not match the shape that the feature regression network expects as input.
Why is the number of channels 196?
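One possible reading of the 196: if the correlation layer compares every location of one feature map with every location of the other, the output gets one channel per location, i.e. 14 × 14 = 196. A minimal NumPy sketch of such a dense correlation, assuming plain dot products between feature vectors; the index ordering inside FeatureCorrelation may differ, and `dense_correlation` is a hypothetical name:

```python
import numpy as np

def dense_correlation(feat_a, feat_b):
    """Dot product between every location of feat_a and every location of feat_b.

    feat_a, feat_b: arrays of shape (batch, channels, h, w).
    Returns an array of shape (batch, h*w, h, w): for each location (i, j)
    of feat_b, one channel per location (k, l) of feat_a.
    """
    b, c, h, w = feat_a.shape
    # corr[n, k, l, i, j] = sum over c of feat_a[n, c, k, l] * feat_b[n, c, i, j]
    corr = np.einsum('nckl,ncij->nklij', feat_a, feat_b)
    return corr.reshape(b, h * w, h, w)

# Random stand-ins for feature_map_1 and feature_map_2.
fa = np.random.rand(1, 512, 14, 14).astype(np.float32)
fb = np.random.rand(1, 512, 14, 14).astype(np.float32)
corr = dense_correlation(fa, fb)
print(corr.shape)   # (1, 196, 14, 14)
```

Under this reading the channel count depends only on the spatial size of the feature maps, not on their 512 channels.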
The error message says that the regression network expects an input of [128, 225, 7, 7], whereas we are feeding it [1, 196, 14, 14].
Where does the problem come from?
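A guess at the cause, still to be confirmed: [128, 225, 7, 7] looks like the weight shape of a convolution (out_channels, in_channels, kernel_h, kernel_w) rather than an input shape. The first layer of the regression network would then expect 225 = 15 × 15 input channels, which is what a 240×240 image would produce (240 / 16 = 15). The sketch below reproduces that kind of mismatch with a standalone layer; the Conv2d(225, 128, kernel_size=7) is an assumption inferred from the error message, not copied from the repository.

```python
import torch
import torch.nn as nn

# Assumed first layer of the regression network (inferred from the error message).
conv = nn.Conv2d(225, 128, kernel_size=7)
print(tuple(conv.weight.shape))          # (128, 225, 7, 7)

# 196 channels (from 224x224 inputs) do not match the 225 the layer expects.
try:
    conv(torch.zeros(1, 196, 14, 14))
    mismatch = False
except RuntimeError as err:
    mismatch = True
    print(err)

# 225 channels on a 15x15 grid (from 240x240 inputs) pass through.
out = conv(torch.zeros(1, 225, 15, 15))
print(tuple(out.shape))                  # (1, 128, 9, 9)
```

If this reading is right, resizing with out_h=240, out_w=240 instead of 224 might be enough to make the shapes agree.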