pytorch搭建孪生网络比较人脸相似性

参考文献:

神经网络学习小记录52——Pytorch搭建孪生神经网络(Siamese network)比较图片相似性_Bubbliiiing的博客-CSDN博客_神经网络图片相似性

Python - 深度学习系列2-人脸比对 Siamese_yukai08008的博客-CSDN博客

1.孪生网络

孪生神经网络(Siamese network)即“连体的神经网络”,

神经网络的“连体”是通过共享权值来实现的,如图所示。

pytorch搭建孪生网络比较人脸相似性_第1张图片

孪生神经网络有两个输入(Input1 and Input2),利用特征提取网络将输入映射到新的空间,形成输入在新的空间中的表示。然后对得到的两个输出进行相减,得到新的输出,并进行全连接层分类,最后输出一个向量,再通过Sigmoid函数将其转化到0-1之间,该值即为两个输入的相似度。

2.孪生网络

(1)特征提取部分

本孪生网络采用vgg16的features作为特征提取网络,提取完后将两个向量展平,便于相减得到新的向量并进行全连接层分类。

代码实现:


vgg16 = models.vgg16(pretrained=True)
# 获取VGG16的特征提取层
vgg = vgg16.features


class SiameseNetwork(nn.Module):
    def __init__(self, input_shape):
        super(SiameseNetwork, self).__init__()

        self.vgg = vgg

       

    def forward_once(self, x):
        output = self.vgg(x)
        output = torch.flatten(output, 1)
       

    def forward(self, input1, input2):
        output1 = self.forward_once(input1)
        output2 = self.forward_once(input2)
       

这里最好不要将features的权重冻结,因为这样不能很好提取我们所需图片的特征,泛化能力也不好。

(2)全连接层

将得到的两个输出(output1和output2)进行相减,得到output,并对output进行全连接层,注意:其展平长度需通过计算得出,最后通过三个全连接层得到一个输出通道,并采取Sigmoid将其范围控制在0到1之间。(由于我们使用的损失函数是BCEWithLogitsLoss,即进行损失计算前会对预测值进行Sigmoid,因此在这里我们就不加Sigmoid)

代码实现:



def get_img_output_length(width, height):
    def get_output_length(input_length):
        # input_length += 6
        filter_sizes = [2, 2, 2, 2, 2]
        padding = [0, 0, 0, 0, 0]
        stride = 2
        for i in range(5):
            input_length = (input_length + 2 * padding[i] - filter_sizes[i]) // stride + 1
        return input_length

    return get_output_length(width) * get_output_length(height)


class SiameseNetwork(nn.Module):
    def __init__(self, input_shape):
        super(SiameseNetwork, self).__init__()

        

        flat_shape = 512 * get_img_output_length(input_shape[1], input_shape[0])
        # flat_shape = 1000
        self.fc = nn.Sequential(
            nn.Linear(flat_shape, 512),
            nn.ReLU(inplace=True),

            nn.Linear(512, 256),
            nn.ReLU(inplace=True),

            nn.Linear(256, 1))


    def forward(self, input1, input2):
      
        output = output1 - output2
        output = self.fc(output)
        # output = nn.Sigmoid(output)
        return output

3.标签的生成

对于相似的图片,我们标签为1;对于不同的图片,我们将标签设置为0

代码实现:

class SiameseNetworkDataset(Dataset):

    def __init__(self, imageFolderDataset, transform=None, should_invert=True):
        self.imageFolderDataset = imageFolderDataset
        self.transform = transform
        self.should_invert = should_invert

    def __getitem__(self, index):
        img0_tuple = random.choice(self.imageFolderDataset.imgs)
        # we need to make sure approx 50% of images are in the same class
        should_get_same_class = random.randint(0, 1)
        if should_get_same_class:
            while True:
                # keep looping till the same class image is found
                img1_tuple = random.choice(self.imageFolderDataset.imgs)
                if img0_tuple[1] == img1_tuple[1]:
                    break
        else:
            while True:
                # keep looping till a different class image is found

                img1_tuple = random.choice(self.imageFolderDataset.imgs)
                if img0_tuple[1] != img1_tuple[1]:
                    break

        img0 = Image.open(img0_tuple[0])
        img1 = Image.open(img1_tuple[0])
        img0 = img0.convert("RGB")
        img1 = img1.convert("RGB")

        if self.should_invert:
            img0 = PIL.ImageOps.invert(img0)
            img1 = PIL.ImageOps.invert(img1)

        if self.transform is not None:
            img0 = self.transform(img0)
            img1 = self.transform(img1)

        return img0, img1, torch.from_numpy(np.array([int(img1_tuple[1] == img0_tuple[1])], dtype=np.float32))

    def __len__(self):
        return len(self.imageFolderDataset.imgs)

4.损失函数和优化器

criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(net.parameters(), 0.001, betas=(0.9, 0.999))

5.训练过程

(1)参数设置:


training_dir = r"D:\Siamese_for_Face\data\faces\training"
train_batch_size = 16
train_number_epochs = 200
input_shape = [224, 224]

(2)数据集加载


transform = transforms.Compose([transforms.Resize((224, 224)),
                                transforms.ToTensor()])
folder_dataset = dset.ImageFolder(root=training_dir)
siamese_dataset = SiameseNetworkDataset(imageFolderDataset=folder_dataset,
                                        transform=transform,
                                        should_invert=False)
train_dataloader = DataLoader(siamese_dataset,
                              shuffle=True,
                              num_workers=0,
                              batch_size=train_batch_size)

(3)网络的加载并移到GPU训练

net = SiameseNetwork(input_shape)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)

(4)训练循环

counter = []
loss_history = []
iteration_number = 0

if __name__ == '__main__':

    for epoch in range(0, train_number_epochs):
        for i, data in enumerate(train_dataloader, 0):
            img0, img1, label = data
            img0, img1, label = img0.to(device), img1.to(device), label.to(device)
            optimizer.zero_grad()
            output = net(img0, img1)
            loss_contrastive = criterion(output, label)
            loss_contrastive.backward()
            optimizer.step()
            if i % 10 == 0:
                print("Epoch number {}\n Current loss {}\n".format(epoch, loss_contrastive.item()))
                iteration_number += 10
                counter.append(iteration_number)
                loss_history.append(loss_contrastive.item())
    plt.plot(counter, loss_history)
    plt.show()
    torch.save(net.state_dict(), 'weights/vgg.pkl')

6.测试过程

(1)展示图片


def imshow(img, text=None, should_save=False):
    npimg = img.numpy()
    plt.axis("off")
    if text:
        plt.text(75, 8, text, style='italic', fontweight='bold',
                 bbox={'facecolor': 'white', 'alpha': 0.8, 'pad': 10})
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

(2)参数设置


testing_dir = r"D:\Siamese_for_Face\data\faces\testing"
input_shape = [224, 224]

(3)加载数据集

testing_dir = r"D:\Siamese_for_Face\data\faces\testing"
input_shape = [224, 224]

transform = transforms.Compose([transforms.Resize((224, 224)),
                                transforms.ToTensor()])
folder_dataset_test = dset.ImageFolder(testing_dir)
siamese_dataset = SiameseNetworkDataset(imageFolderDataset=folder_dataset_test,
                                        transform=transform,
                                        should_invert=False)
test_dataloader = DataLoader(siamese_dataset, num_workers=0, batch_size=1, shuffle=True)

(4)加载网络和训练过的权重

net = SiameseNetwork(input_shape)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)
net.load_state_dict(torch.load(r'D:\Siamese_for_Face\weights\vgg.pkl'))

(5)测试过程

if __name__ == '__main__':

    dataiter = iter(test_dataloader)
    x0, _, _ = next(dataiter)

    for i in range(10):
        _, x1, label2 = next(dataiter)
        x0, x1, label2 = x0.to(device), x1.to(device), label2.to(device)
        concatenated = torch.cat((x0, x1), 0)
        # output1, output2 = net(Variable(x0), Variable(x1))
        output = net(Variable(x0), Variable(x1))[0]
        output = torch.nn.Sigmoid()(output)
        # euclidean_distance = F.pairwise_distance(output1, output2)
        imshow(torchvision.utils.make_grid(concatenated).cpu(),
               'similarity: {:.2f}'.format(output.item()))

7.网络的效果

pytorch搭建孪生网络比较人脸相似性_第2张图片

 pytorch搭建孪生网络比较人脸相似性_第3张图片

这里我是设置了0.0005的学习率和400个epochs

感觉最后训练的效果很好,说明vgg16网络的features的特征提取能力很强大,这里要注意的是,不要设置太大的学习率,因为我们这是迁移学习,主要是利用vgg16特征提取的权重,设置太大的学习会将原本训练好的vgg16的权重扭曲太多。 

8.代码

(1)gitee

Siamese_for_Face.zip · xuxuxuxu/xuxuxuxu - 码云 - 开源中国 (gitee.com)

(2)github

xuxuxuxu/Siamese_for_Face.zip at main · xuxuxuxuxuxu97/xuxuxuxu (github.com)

你可能感兴趣的:(人工智能,深度学习,图像相似算法,pytorch,深度学习,人工智能)