Face Detection with MTCNN and Face Recognition with FaceNet (2)

I. Main components
align/: neural networks for face detection and face alignment
facenet: the neural network that maps faces to embeddings
util/plot_learning_curves.m: a MATLAB script for plotting the training progress when training the softmax model

II. Functions under facenet/contributed/:
1. Face clustering based on MTCNN and FaceNet
Code: facenet/contributed/cluster.py (facenet/contributed/clustering.py implements similar functionality, just without the MTCNN detection step)
Main functionality:
① Detect, align and crop faces with MTCNN
② Compute FaceNet embeddings for the cropped faces
③ Cluster the embedding vectors by Euclidean distance
2. Face recognition based on MTCNN and FaceNet (given a single image, decide who the person is)
Code: facenet/contributed/predict.py
Main functionality:
① Detect, align and crop faces with MTCNN
② Compute FaceNet embeddings for the cropped faces
③ Run predict.py to perform recognition (requires a trained SVM model)
3. Export face embeddings and image labels as NumPy arrays
Code: facenet/contributed/export_embeddings.py
Main functionality:
① The input data must already be aligned and cropped
② Outputs embeddings.npy, labels.npy and label_strings.npy (a sketch of consuming these files follows below)
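As a quick illustration of the clustering step and of the exported arrays, here is a minimal sketch (not the repository's code) that loads the files written by export_embeddings.py and groups faces whose embeddings lie within an assumed Euclidean distance threshold:

import numpy as np

embeddings = np.load('embeddings.npy')       # (N, d) face embeddings from export_embeddings.py
names = np.load('label_strings.npy')         # (N,) image names

threshold = 1.0                              # assumed distance below which two faces count as the same person
clusters = []                                # each cluster is a list of indices into embeddings
for i, emb in enumerate(embeddings):
    for cluster in clusters:
        # compare against the first member of the cluster
        if np.linalg.norm(emb - embeddings[cluster[0]]) < threshold:
            cluster.append(i)
            break
    else:
        clusters.append([i])

for k, cluster in enumerate(clusters):
    print('cluster %d: %s' % (k, [names[i] for i in cluster]))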

Below we introduce a classic face recognition system, Google's FaceNet. The pipeline consists of two parts:

  • MTCNN: performs face detection and face alignment and outputs 160×160 face crops;
  • CNN: maps a face image (160×160 input by default) directly into a Euclidean space in which distances measure facial similarity. Once such an embedding space exists, face recognition, verification and clustering become straightforward (see the sketch below);
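As a minimal illustration of this idea (not code from the repository; the embedding dimensionality depends on the model, e.g. 128 for the softmax-trained models later in this post):

import numpy as np

# Placeholder embeddings; in practice they come out of the FaceNet model.
emb_a = np.random.rand(128); emb_a /= np.linalg.norm(emb_a)
emb_b = np.random.rand(128); emb_b /= np.linalg.norm(emb_b)

dist = np.linalg.norm(emb_a - emb_b)            # Euclidean distance in embedding space
print('same person' if dist < 1.0 else 'different people', dist)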

First download the facenet source code from GitHub: https://github.com/davidsandberg/facenet. After unpacking it, the repository looks like this:
[screenshot of the repository layout]

1. Import the required packages

import tensorflow as tf
import sklearn
import scipy
import cv2
import h5py
import matplotlib
import PIL
import requests
import psutil

If importing any of these packages raises an error, simply install the missing package.
Either install it individually with pip install <package name>, or install all the dependencies from requirements.txt as follows:

pip install -r requirements.txt

My setup is PyCharm + Anaconda3-5.2.0-Windows-x86_64 + tensorflow-gpu 1.14 (the facenet code targets TensorFlow 1.7; keep this version difference in mind, because it causes a problem later on that I brought upon myself).

2. Configure the facenet environment

Add the src folder to the PYTHONPATH environment variable (a temporary environment variable). To make it permanent, go to Computer → Properties → Advanced system settings → Environment Variables → System variables → Path and append the path there. The point of adding it is that when Python cannot find a module in the current directory, it also searches the paths listed in the environment variable. For details on adding environment variables see this post:
https://blog.csdn.net/Tona_ZM/article/details/79463284
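Alternatively, as a small sketch (assuming the repository lives at D:\program\facenet, the path used in the PyCharm setup later in this post), the same effect can be achieved inside a script or an interactive session by extending sys.path before importing the facenet modules:

import sys
# Make facenet/src importable without touching system-wide environment variables.
sys.path.insert(0, r'D:\program\facenet\src')

import facenet               # should now resolve
import align.detect_face     # the MTCNN implementation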

3. Download the LFW dataset

Next we will walk through how to test a pre-trained model on the LFW (Labeled Faces in the Wild) database, but first a short introduction to the dataset.
LFW was compiled by the computer vision lab at the University of Massachusetts Amherst and is a public benchmark for evaluating face recognition algorithms. It contains 13,233 JPEG images of 5,749 different people; 1,680 of them have more than one image. Every image is 250×250 pixels and is labeled with the person's name. Images are named "lfw/name/name_xxxx.jpg", where "xxxx" is a zero-padded four-digit image index. For example, the tenth image of former US president George W. Bush is "lfw/George_W_Bush/George_W_Bush_0010.jpg".
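As a tiny illustration of this naming scheme (a sketch, not repository code):

import os

def lfw_path(name, index):
    # Build the LFW file name for a given person and one-based image index.
    return os.path.join('lfw', name, '%s_%04d.jpg' % (name, index))

print(lfw_path('George_W_Bush', 10))    # the path of George W. Bush's tenth image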
The dataset can be downloaded from http://vis-www.cs.umass.edu/lfw/lfw.tgz. After downloading, extract it and open one of the folders:
[screenshot of an LFW person folder]

Create a new folder named raw inside lfw and move everything in lfw (except raw itself) into it. In my setup the lfw dataset sits in a datasets folder, and datasets is at the same level as the facenet directory.

4. Preprocessing LFW (face detection and alignment on the LFW database)

We need to bring the evaluation images to the same size as the images the model was trained on (160×160). The converted dataset is stored in the lfw_mtcnnpy_160 folder. The first step is to run the MTCNN network for face detection and alignment and to scale the crops to 160×160.
The MTCNN implementation lives mainly in facenet/src/align, whose contents are:
[screenshot of facenet/src/align]

  • detect_face.py: defines the MTCNN model structure, made up of P-Net, R-Net and O-Net; pre-trained weights for the three networks are provided in det1.npy, det2.npy and det3.npy.
  • align_dataset_mtcnn.py: the entry point that uses the MTCNN model for face detection and alignment.

To detect and align the faces of the LFW database with align_dataset_mtcnn.py, open an Anaconda Prompt (or simply use the Terminal pane), change to the facenet directory, and run the following command:

python src/align/align_dataset_mtcnn.py datasets/lfw/raw  datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction=0.25   # drop --gpu_memory_fraction=0.25 if you are not using the GPU build of TensorFlow

If relative paths cause errors, replace them all with absolute paths and try again.

If you prefer PyCharm, set up the run configuration as follows and run align_dataset_mtcnn.py directly:

Parameters: D:\program\facenet\datasets\lfw\raw D:\program\facenet\datasets\lfw\lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction=0.25

Environment variables: PYTHONUNBUFFERED=1;PYTHONPATH=D:\program\facenet\src

Working directory: D:\program\facenet\src
The run looks like this:
[screenshot of the alignment run]
This command creates the folder datasets/lfw/lfw_mtcnnpy_160 and stores all aligned face images there, with the same directory structure as datasets/lfw/raw. The parameters --image_size 160 --margin 32 mean that the bounding box returned by MTCNN is expanded by a 32-pixel margin (the data used for training was slightly larger) and the crop is then scaled to 160×160, so all aligned images end up 160×160 pixels. With that, faces have been successfully detected and aligned from the original images.
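To make the margin handling concrete, here is a minimal sketch of the crop computation, mirroring the logic in the align_dataset_mtcnn.py listing further down; the bounding-box numbers are made up:

import numpy as np

det = np.array([70, 60, 180, 190])        # hypothetical x1, y1, x2, y2 from MTCNN
img_h, img_w = 250, 250                   # LFW images are 250x250
margin, image_size = 32, 160

bb = np.zeros(4, dtype=np.int32)
bb[0] = np.maximum(det[0] - margin / 2, 0)        # expand left edge, clipped to the image
bb[1] = np.maximum(det[1] - margin / 2, 0)        # expand top edge
bb[2] = np.minimum(det[2] + margin / 2, img_w)    # expand right edge
bb[3] = np.minimum(det[3] + margin / 2, img_h)    # expand bottom edge
print(bb)   # crop = img[bb[1]:bb[3], bb[0]:bb[2]] is then resized to image_size x image_size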
Problems encountered and how to solve them

  • 1. Running align_dataset_mtcnn.py reports No module named 'align.detect_face'
    [screenshot of the error]
    Solution:
    Set Parameters to
D:\program\facenet\datasets\lfw\raw  D:\program\facenet\datasets\lfw\lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction=0.25

Then choose Run → Edit Configurations and set Environment variables to PYTHONUNBUFFERED=1;PYTHONPATH=D:\program\facenet\src.
[screenshot of the run configuration]

  • 2. Running align_dataset_mtcnn.py reports [Errno 13] Permission denied
    [screenshot of the error]
    Solutions:
    (1) Check whether the file at that path exists and is locked by another program; if it is, close the program that is holding it.
    (2) Run cmd with administrator privileges.
    (3) Check whether you are pointing the script at a folder where it expects files.

    The above are the suggestions found online. My error was indeed caused by the script opening a folder: the LFW path given in Parameters must go down to the directory directly above the per-person folders. I had left out the \raw level; all the person folders live under raw, so if the LFW path only goes down to lfw, the program only sees the raw directory and cannot read the images inside it (a low-level mistake, let it be a warning).
    Parameters is set to D:\program\facenet\datasets\lfw\raw
    D:\program\facenet\datasets\lfw\lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction=0.25, which are, in order: the path of the LFW dataset, the output path for the aligned faces, the crop settings (expand the box by a 32-pixel margin and scale to 160×160), and the fraction of GPU memory to use.
    [screenshot of the corrected run]

Below is a brief analysis of the align_dataset_mtcnn.py source file. Its main() function does the following, and the full source is listed afterwards:

  • First the LFW dataset is loaded;
  • An MTCNN network is built and initialized with the pre-trained weights. The FaceNet author wrote his own versions of the CNN building blocks (conv layers, max-pooling, softmax, and so on), so the code is somewhat involved; if you are interested, see the blog post on a TensorFlow implementation of MTCNN, whose author re-implemented MTCNN in Keras in a more readable way, code at https://github.com/FortiLeiZhang/model_zoo/tree/master/TensorFlow/mtcnn;
  • align.detect_face.detect_face() is called to detect faces; it returns the calibrated face bounding boxes, their scores and the facial landmark coordinates;
  • The face boxes are post-processed: crops are cut out of the original image (after expanding the box by a 32-pixel margin), scaled to 160×160, and the images and related information are saved to files;

For the details of face detection, read the detect_face() function, or refer to the MTCNN TensorFlow implementation blog post mentioned above.

"""Performs face alignment and stores face thumbnails in the output directory."""
# MIT License
# 
# Copyright (c) 2016 David Sandberg
# 
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# 
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# 
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from scipy import misc
import sys
import os
import argparse
import tensorflow as tf
import numpy as np
import facenet
import align.detect_face
import random
from time import sleep



'''
Face detection and alignment with the MTCNN network.
'''

def main(args):
    '''
    Args:
        args: parsed command-line arguments
    '''
    
    sleep(random.random())
    # Directory where the aligned face images will be stored
    output_dir = os.path.expanduser(args.output_dir)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    # Store some git revision info in a text file in the log directory (records the configuration used)
    src_path,_ = os.path.split(os.path.realpath(__file__))
    facenet.store_revision_info(src_path, output_dir, ' '.join(sys.argv))
    
    '''1. Load the LFW dataset: get every class (person) name and the absolute paths of its images'''
    dataset = facenet.get_dataset(args.input_dir)
    
    print('Creating networks and loading parameters')
    
    '''2. Build the MTCNN network and initialize it with the pre-trained weights'''
    with tf.Graph().as_default():
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
        sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
        with sess.as_default():
            pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)
    
    minsize = 20                   # minimum size of face
    threshold = [ 0.6, 0.7, 0.7 ]  # three steps's threshold
    factor = 0.709                 # scale factor

    # Add a random key to the filename to allow alignment using multiple processes
    random_key = np.random.randint(0, high=99999)
    bounding_boxes_filename = os.path.join(output_dir, 'bounding_boxes_%05d.txt' % random_key)
    
    '''3. Write the face bounding box of every image to a record file'''
    with open(bounding_boxes_filename, "w") as text_file:
        nrof_images_total = 0
        nrof_successfully_aligned = 0
        if args.random_order:
            random.shuffle(dataset)
        # Iterate over every person and the absolute paths of their images
        for cls in dataset:
            # Output folder for this person
            output_class_dir = os.path.join(output_dir, cls.name)
            if not os.path.exists(output_class_dir):
                os.makedirs(output_class_dir)
                if args.random_order:
                    random.shuffle(cls.image_paths)
            # Iterate over every image of this person
            for image_path in cls.image_paths:
                nrof_images_total += 1
                filename = os.path.splitext(os.path.split(image_path)[1])[0]
                output_filename = os.path.join(output_class_dir, filename+'.png')
                print(image_path)
                if not os.path.exists(output_filename):
                    try:
                        img = misc.imread(image_path)
                    except (IOError, ValueError, IndexError) as e:
                        errorMessage = '{}: {}'.format(image_path, e)
                        print(errorMessage)
                    else:
                        if img.ndim<2:
                            print('Unable to align "%s"' % image_path)
                            text_file.write('%s\n' % (output_filename))
                            continue
                        if img.ndim == 2:
                            img = facenet.to_rgb(img)
                        img = img[:,:,0:3]
    
                        # Face detection. bounding_boxes: detected boxes, shape [n,5] with x1, y1, x2, y2, score
                        # _: facial landmark coordinates, shape [n,10]
                        bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor)
                        # Number of detected faces
                        nrof_faces = bounding_boxes.shape[0]
                        if nrof_faces>0:
                            # [n,4] face boxes
                            det = bounding_boxes[:,0:4]
                            # Collect the face boxes to keep
                            det_arr = []
                            img_size = np.asarray(img.shape)[0:2]
                            if nrof_faces>1:
                                # More than one face detected in this image
                                if args.detect_multiple_faces:
                                    for i in range(nrof_faces):
                                        det_arr.append(np.squeeze(det[i]))
                                else:
                                    bounding_box_size = (det[:,2]-det[:,0])*(det[:,3]-det[:,1])
                                    img_center = img_size / 2
                                    offsets = np.vstack([ (det[:,0]+det[:,2])/2-img_center[1], (det[:,1]+det[:,3])/2-img_center[0] ])
                                    offset_dist_squared = np.sum(np.power(offsets,2.0),0)
                                    index = np.argmax(bounding_box_size-offset_dist_squared*2.0) # some extra weight on the centering
                                    det_arr.append(det[index,:])
                            else:
                                # Only a single face box
                                det_arr.append(np.squeeze(det))

                            # Process each face box
                            for i, det in enumerate(det_arr):
                                # [4,] expand the box by the margin and crop
                                det = np.squeeze(det)
                                bb = np.zeros(4, dtype=np.int32)
                                bb[0] = np.maximum(det[0]-args.margin/2, 0)
                                bb[1] = np.maximum(det[1]-args.margin/2, 0)
                                bb[2] = np.minimum(det[2]+args.margin/2, img_size[1])
                                bb[3] = np.minimum(det[3]+args.margin/2, img_size[0])
                                cropped = img[bb[1]:bb[3],bb[0]:bb[2],:]
                                # Resize to the target size, save the image and record the box coordinates
                                scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear')
                                nrof_successfully_aligned += 1
                                filename_base, file_extension = os.path.splitext(output_filename)
                                if args.detect_multiple_faces:
                                    output_filename_n = "{}_{}{}".format(filename_base, i, file_extension)
                                else:
                                    output_filename_n = "{}{}".format(filename_base, file_extension)
                                misc.imsave(output_filename_n, scaled)
                                text_file.write('%s %d %d %d %d\n' % (output_filename_n, bb[0], bb[1], bb[2], bb[3]))
                        else:
                            print('Unable to align "%s"' % image_path)
                            text_file.write('%s\n' % (output_filename))
                            
    print('Total number of images: %d' % nrof_images_total)
    print('Number of successfully aligned images: %d' % nrof_successfully_aligned)
            

def parse_arguments(argv):
    '''
    Parse command-line arguments.
    '''
    parser = argparse.ArgumentParser()
        
    # Define the arguments; input_dir and output_dir are positional arguments
    parser.add_argument('input_dir', type=str, help='Directory with unaligned images.')
    parser.add_argument('output_dir', type=str, help='Directory with aligned face thumbnails.')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--margin', type=int,
        help='Margin for the crop around the bounding box (height, width) in pixels.', default=32)
    parser.add_argument('--random_order', 
        help='Shuffles the order of images to enable alignment using multiple processes.', action='store_true')
    parser.add_argument('--gpu_memory_fraction', type=float,
        help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
    parser.add_argument('--detect_multiple_faces', type=bool,
                        help='Detect and align multiple faces per image.', default=False)
    # Parse the arguments
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

5. Validating accuracy on the LFW dataset with a pre-trained model

The original author provides two pre-trained models, trained on the CASIA-WebFace and VGGFace2 face databases respectively; download links are on https://github.com/davidsandberg/facenet:
[screenshot of the pre-trained model table]
Note: these two model files are hosted externally and may require special network access to download!
Here we use the model pre-trained on VGGFace2 with the Inception-ResNet v1 architecture; it reaches roughly 99.05% accuracy on LFW. After downloading, extract the archive into facenet/models (create the models folder yourself). You get a folder named 20180402-114759 containing four files:
[screenshot of the model folder contents]

  • model.meta: the metagraph file, which stores the structure of the computation graph;
  • model.ckpt.data: the weights file, which stores the values of all the variables in the graph;
  • model.ckpt.index: stores the index that ties the meta file and the data file together;
  • the .pb file: the graph and the weights merged into a single file, mainly intended for deployment; see https://blog.csdn.net/yjl9122/article/details/78341689 for details;
  • normally there would also be a checkpoint file that records the absolute paths of the latest checkpoint (i.e. the last three files above); tf.train.latest_checkpoint relies on this information when loading, so if the recorded paths are changed or deleted, loading fails because the files cannot be found (a small loading sketch follows below);
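The snippet below is only a minimal sketch of how such a model directory can be restored with the TensorFlow 1.x API (facenet.load_model, used in validate_on_lfw.py further down, wraps similar logic); the .meta file name and the checkpoint prefix are hypothetical and should be adjusted to what the extracted folder actually contains:

import os
import tensorflow as tf

model_dir = 'models/20180402-114759'
with tf.Graph().as_default(), tf.Session() as sess:
    # Rebuild the graph from the metagraph file (hypothetical file name).
    saver = tf.train.import_meta_graph(os.path.join(model_dir, 'model-20180402-114759.meta'))
    ckpt = tf.train.latest_checkpoint(model_dir)        # reads the 'checkpoint' file if one is present
    if ckpt is None:
        ckpt = os.path.join(model_dir, 'model-20180402-114759.ckpt-275')   # hypothetical checkpoint prefix
    saver.restore(sess, ckpt)                            # load the variable values
    embeddings = tf.get_default_graph().get_tensor_by_name('embeddings:0')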

With that, the preparation is essentially done: we have the LFW test data, the model and the code, so we can evaluate the model's accuracy.
Open an Anaconda Prompt and change to the facenet directory (note: the facenet directory itself; activate your environment with activate your_environment first, then switch to the drive and folder where the facenet code lives, D in my case),
[screenshot of the prompt]
and then run the following command:

python src/validate_on_lfw.py  datasets/lfw/lfw_mtcnnpy_160/ models/20180402-114759

You can also set the parameters in PyCharm and run validate_on_lfw.py directly.

  • Parameter settings for validate_on_lfw.py
    [screenshot of the run configuration]
  • A TensorFlow version problem: I am on tensorflow-gpu 1.14, while the author's pre-trained model was produced with TensorFlow 1.7, so importing the graph fails with the following error:
    [screenshot of the error]
    Workarounds:
    (1) Switch to TensorFlow 1.7;
    (2) In facenet.py, locate create_input_pipeline and add a with tf.name_scope("tempscope"): statement around its body, which solves the problem (apparently this bug is only fixed in TensorFlow 1.10 and later).
    [screenshot of the modified code]
    After the fix, rerun the script. The accuracy reaches 0.98467±0.00407, printed as follows:
    [screenshot of the evaluation output]
    This confirms that the model's accuracy on LFW is 98.467%. The validate_on_lfw.py source is as follows:
"""Validate a face recognizer on the "Labeled Faces in the Wild" dataset (http://vis-www.cs.umass.edu/lfw/).
Embeddings are calculated using the pairs from http://vis-www.cs.umass.edu/lfw/pairs.txt and the ROC curve
is calculated and plotted. Both the model metagraph and the model parameters need to exist
in the same directory, and the metagraph should have the extension '.meta'.
"""
# MIT License
# 
# Copyright (c) 2016 David Sandberg
# 
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# 
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# 
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
from tensorflow.python.ops import data_flow_ops
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
  
    with tf.Graph().as_default():
      
        with tf.Session() as sess:
            
            # Read the file containing the pairs used for testing  list 
            # Each element is either a same-person pair [Abel_Pacheco 1 4] or a different-person pair [Ben_Kingsley 1 Daryl_Hannah 1]
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))

            # Get the paths for the corresponding images
            # Get the paths of the test images; actual_issame indicates whether each pair is the same person
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs)
            
            # Define placeholders
            image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths')
            labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels')
            batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size')
            control_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='control')
            phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train')
 
            # Read the data through an input queue
            nrof_preprocess_threads = 4
            image_size = (args.image_size, args.image_size)
            eval_input_queue = data_flow_ops.FIFOQueue(capacity=2000000,
                                        dtypes=[tf.string, tf.int32, tf.int32],
                                        shapes=[(1,), (1,), (1,)],
                                        shared_name=None, name=None)
            eval_enqueue_op = eval_input_queue.enqueue_many([image_paths_placeholder, labels_placeholder, control_placeholder], name='eval_enqueue_op')
            image_batch, label_batch = facenet.create_input_pipeline(eval_input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder)
     
            # Load the model
            input_map = {'image_batch': image_batch, 'label_batch': label_batch, 'phase_train': phase_train_placeholder}
            facenet.load_model(args.model, input_map=input_map)

            # Get output tensor
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
#              
            # Create a coordinator to manage the queue threads
            coord = tf.train.Coordinator()
            tf.train.start_queue_runners(coord=coord, sess=sess)

            # Start the evaluation
            evaluate(sess, eval_enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder,
                embeddings, label_batch, paths, actual_issame, args.lfw_batch_size, args.lfw_nrof_folds, args.distance_metric, args.subtract_mean,
                args.use_flipped_images, args.use_fixed_image_standardization)

              
def evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder,
        embeddings, labels, image_paths, actual_issame, batch_size, nrof_folds, distance_metric, subtract_mean, use_flipped_images, use_fixed_image_standardization):
    # Run forward pass to calculate embeddings
    print('Runnning forward pass on LFW images')
    
    # Enqueue one epoch of image paths and labels
    nrof_embeddings = len(actual_issame)*2  # nrof_pairs * nrof_images_per_pair
    nrof_flips = 2 if use_flipped_images else 1
    nrof_images = nrof_embeddings * nrof_flips
    labels_array = np.expand_dims(np.arange(0,nrof_images),1)
    image_paths_array = np.expand_dims(np.repeat(np.array(image_paths),nrof_flips),1)
    control_array = np.zeros_like(labels_array, np.int32)
    if use_fixed_image_standardization:
        control_array += np.ones_like(labels_array)*facenet.FIXED_STANDARDIZATION
    if use_flipped_images:
        # Flip every second image
        control_array += (labels_array % 2)*facenet.FLIP
    sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array})
    
    embedding_size = int(embeddings.get_shape()[1])
    assert nrof_images % batch_size == 0, 'The number of LFW images must be an integer multiple of the LFW batch size'
    nrof_batches = nrof_images // batch_size
    emb_array = np.zeros((nrof_images, embedding_size))
    lab_array = np.zeros((nrof_images,))
    for i in range(nrof_batches):
        feed_dict = {phase_train_placeholder:False, batch_size_placeholder:batch_size}
        emb, lab = sess.run([embeddings, labels], feed_dict=feed_dict)
        lab_array[lab] = lab
        emb_array[lab, :] = emb
        if i % 10 == 9:
            print('.', end='')
            sys.stdout.flush()
    print('')
    embeddings = np.zeros((nrof_embeddings, embedding_size*nrof_flips))
    if use_flipped_images:
        # Concatenate embeddings for flipped and non flipped version of the images
        embeddings[:,:embedding_size] = emb_array[0::2,:]
        embeddings[:,embedding_size:] = emb_array[1::2,:]
    else:
        embeddings = emb_array

    assert np.array_equal(lab_array, np.arange(nrof_images))==True, 'Wrong labels used for evaluation, possibly caused by training examples left in the input pipeline'
    tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(embeddings, actual_issame, nrof_folds=nrof_folds, distance_metric=distance_metric, subtract_mean=subtract_mean)
    
    print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy)))
    print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
    
    auc = metrics.auc(fpr, tpr)
    print('Area Under Curve (AUC): %1.3f' % auc)
    eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
    print('Equal Error Rate (EER): %1.3f' % eer)
    
def parse_arguments(argv):
    '''
    Parse command-line arguments.
    '''
    parser = argparse.ArgumentParser()
    
    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str, 
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    parser.add_argument('--distance_metric', type=int,
        help='Distance metric  0:euclidian, 1:cosine similarity.', default=0)
    parser.add_argument('--use_flipped_images', 
        help='Concatenates embeddings for the image and its horizontally flipped counterpart.', action='store_true')
    parser.add_argument('--subtract_mean', 
        help='Subtract feature mean before calculating distance.', action='store_true')
    parser.add_argument('--use_fixed_image_standardization', 
        help='Performs fixed standardization of images.', action='store_true')
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))
  • First data/pairs.txt is loaded; it lists the image pairs used for testing, both same-person pairs and different-person pairs;
  • An input pipeline object is created that loads the data through TensorFlow's queue mechanism;
  • The facenet model is loaded;
  • The QueueRunners are started, the distance of every test pair is computed, and accuracy is evaluated by comparing the decision (a distance below 1 means the same person, otherwise different people) against the actual labels;
    [screenshot of the evaluation output]
    So what exactly is the "data/pairs.txt" passed in through args.lfw_pairs, and what do the numbers in each line mean? There are two cases. If a line has three fields (Abel_Pacheco 1 4), the first field is the folder name, i.e. the person's name, and the second and third numbers together with that name form two image file names inside the folder; both images show the same face, so the pair is marked issame=True. If a line has four fields (Robert_Downey_Jr 1 Tommy_Shane_Steiner 1), the first and third fields are two different people, and the second and fourth numbers index one image in each person's folder respectively (see the parsing sketch below).
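As a minimal sketch (a simplified stand-in for lfw.read_pairs/lfw.get_paths), one pairs.txt line can be turned into two image paths plus an is-same flag using the naming convention described above; the lfw_dir default is the aligned folder from section 4:

import os

def parse_pair(line, lfw_dir='datasets/lfw/lfw_mtcnnpy_160'):
    parts = line.strip().split()
    if len(parts) == 3:                                   # same person: name idx1 idx2
        name, i, j = parts[0], int(parts[1]), int(parts[2])
        return (os.path.join(lfw_dir, name, '%s_%04d.png' % (name, i)),
                os.path.join(lfw_dir, name, '%s_%04d.png' % (name, j)), True)
    else:                                                 # different people: name1 idx1 name2 idx2
        n1, i, n2, j = parts[0], int(parts[1]), parts[2], int(parts[3])
        return (os.path.join(lfw_dir, n1, '%s_%04d.png' % (n1, i)),
                os.path.join(lfw_dir, n2, '%s_%04d.png' % (n2, j)), False)

print(parse_pair('Abel_Pacheco 1 4'))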

6. Applying the pre-trained model to our own images

In practice we often also want to apply an existing model to our own images. Below we use computing the distances between faces as an example of how to do that.

Suppose we have three images stored under facenet/src, named img1.jpg, img2.jpg and img3.jpg, each containing one face, and we want the pairwise distances between them. This is what facenet/src/compare.py does.
[screenshot of the three test images]
Open an Anaconda Prompt, change to the facenet directory (again, the facenet directory itself) and run:

python src/compare.py src/models/20180402-114759 src/img1.jpg  src/img2.jpg src/img3.jpg

However, the output only shows 0 successful operations, 0 derived errors ignored.
[screenshot of the truncated output]
Since tensorflow-gpu 1.14 is installed, my guess was that no GPU memory fraction had been specified, so I added --gpu_memory_fraction=0.25 to the command:

python src/compare.py src/models/20180402-114759 src/img1.jpg  src/img2.jpg src/img3.jpg --gpu_memory_fraction=0.25

The result:
[screenshot of the distance matrix]
Next we test with images of three different people:
[screenshot of the three test images]
Run the following command:

python src/compare.py src/models/20180402-114759 src/img3.jpg  src/img4.jpg src/img5.jpg --gpu_memory_fraction=0.25

The result is shown below:
[screenshot of the distance matrix]
We find that images of the same person give a smaller distance and images of different people a larger one. Normally the distance between two images of the same person should be below 1 and between different people above 1. The results above do not quite follow this rule, which I attribute mostly to the choice of photos: for testing, pick reasonably clear and frontal faces, and images from a distribution similar to the training data, i.e. here preferably photos of Western faces.
[screenshot of the new test images]

Running python src/compare.py src/models/20180402-114759 src/img6.jpg src/img7.jpg src/img8.jpg --gpu_memory_fraction=0.25 gives:
[screenshot of the distance matrix]
This works quite well. So if we want equally good results on Chinese faces, we need to train the model on a dataset of Chinese faces.

7. Training a new model from scratch

Training a new model from scratch requires a very large dataset; here we use CASIA-WebFace as the example. The dataset can no longer be downloaded from its original source, and it reportedly contains many invalid images, so we use a cleaned-up version instead, available from a Baidu Netdisk link with extraction code 3zbb.

The database has 494,414 images of 10,575 identities; each identity has its own folder containing from a few to a few dozen face photos of that person. We first use MTCNN to crop the faces out of these photos and then hand them to FaceNet for training.

After downloading, extract it into datasets/casia/raw:
[screenshot of the extracted dataset]
Each folder corresponds to one person and holds all of that person's face images. As with LFW, we first run MTCNN on the original images for face detection and alignment: open an Anaconda Prompt, change to the facenet directory and run:

python  src/align/align_dataset_mtcnn.py   datasets/casia/raw  datasets/casia/casia_mtcnnpy_182 --image_size 182 --margin 44 --random_order --gpu_memory_fraction=0.25

The aligned images are stored under datasets/casia/casia_mtcnnpy_182, each 182×182 pixels. The final network input is 160×160; generating 182×182 images first leaves room for the random-crop step of data augmentation: during training a 160×160 region is randomly cropped from each 182×182 image before being fed to the network (see the sketch below).
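As a minimal sketch of that augmentation step (a NumPy stand-in, not the project's TensorFlow input pipeline):

import numpy as np

img = np.zeros((182, 182, 3), dtype=np.uint8)    # placeholder aligned face image
crop = 160
top = np.random.randint(0, img.shape[0] - crop + 1)     # random vertical offset
left = np.random.randint(0, img.shape[1] - crop + 1)    # random horizontal offset
patch = img[top:top + crop, left:left + crop, :]
print(patch.shape)    # (160, 160, 3)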

Start training with the following command:

python src/train_softmax.py --logs_base_dir ./logs  --models_base_dir  ./models  --data_dir datasets/casia/casia_mtcnnpy_182  --image_size 160 --model_def models.inception_resnet_v1 --lfw_dir datasets/lfw/lfw_mtcnnpy_160 --optimizer RMSPROP --learning_rate -1 --max_nrof_epochs 80 --keep_probability 0.8  --random_crop --random_flip --learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt  --weight_decay 5e-5 --center_loss_factor 1e-2 --center_loss_alfa 0.9

The command has a lot of parameters, so let us go through them one by one. The script src/train_softmax.py trains the model with a combination of center loss and softmax loss; its parameters are:

  • --logs_base_dir ./logs: training logs are written under ./logs; at run time a subfolder named after the current time is created there and the logs go into it. The "log" is actually a TensorFlow events file containing the current loss, training step, learning rate and so on; we will inspect it later with TensorBoard;
  • --models_base_dir ./models: the trained model is saved under ./models; at run time a subfolder named after the current time is created there to hold the model;
  • --data_dir datasets/casia/casia_mtcnnpy_182: the path of the training data, i.e. the CASIA-WebFace faces we just aligned;
  • --image_size 160: the images fed into the network are 160×160;
  • --model_def models.inception_resnet_v1: the convolutional network used for training is inception_resnet_v1. The networks supported by the project live under src/models and include inception_resnet_v1, inception_resnet_v2 and squeezenet; the first two are large, the last one is small. If you run out of memory or GPU memory during training, try squeezenet, or lower batch_size to 32 or 64 (the default is 90);
  • --lfw_dir datasets/lfw/lfw_mtcnnpy_160: the path of the LFW dataset. If this is given, the model is evaluated on LFW after every epoch and the accuracy is written to the log file;
  • --optimizer RMSPROP: the optimization method used for training;
  • --learning_rate -1: the learning rate; a negative value means this parameter is ignored and the schedule from --learning_rate_schedule_file is used instead;
  • --max_nrof_epochs 80: the number of training epochs;
  • --keep_probability 0.8: the dropout keep probability;
  • --random_crop: use random cropping for data augmentation;
  • --random_flip: use random horizontal flipping for data augmentation;
  • --learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt: since --learning_rate -1 was specified, the learning rate is determined by this schedule file (its original contents were shown in a figure not reproduced here; an illustrative example of the format follows this list);
  • --weight_decay 5e-5: the regularization coefficient;
  • --center_loss_factor 1e-2: the weight balancing the center loss against the softmax loss;
  • --center_loss_alfa 0.9: the internal update rate of the center loss;
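The schedule file maps an epoch number to the learning rate to use from that epoch on; facenet.get_learning_rate_from_file reads it line by line, and the training loop stops once the scheduled rate becomes negative (see train() in the listing below). The values here are only an illustration of the format, not the repository's actual schedule:

0:  0.05
60: 0.005
70: 0.0005
80: -1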

Besides the parameters used above there are many more; some of the important ones:

  • --pretrained_model models/20180408-102900: a pre-trained model; starting from one speeds up training considerably (commonly used for fine-tuning);
  • --batch_size: the batch size; the larger it is, the more memory is needed;
  • --random_rotate: use random rotation for data augmentation.

Because CASIA-WebFace is large and training on all of it takes a long time, below I train on only a part of it. Running the command produces:
[screenshot of the error]
Workarounds:
1) Specify which GPU to use by adding the following at the top of the script:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
2) Limit how much GPU memory the script may use: add the first line near the top and build the session as in the second line:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
After these changes, run again; the output looks like this:
[screenshots of the training log]
Here Epoch: [1][683/1000] means we are in epoch 1 at its 683rd training batch; the default epoch_size is 1000, i.e. one epoch consists of 1000 batches. Time is the time taken by this step, Lr is the learning rate, Loss is the total loss of the current batch, Xent the softmax loss, RegLoss the sum of the regularization losses and the center loss, and Cl the center loss (all of these are per-batch averages, i.e. the batch loss divided by batch_size).
In the end there is still one more error:
[screenshot of the error]
In Python 3, iteritems() has been renamed to items(), so the failing call has to be changed accordingly (see the one-line fix below).
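Concretely (a sketch; the loop over the stat dictionary in the train_softmax.py listing below already uses the Python 3 form):

stat = {'loss': [0.1, 0.2]}             # stand-in for the statistics dictionary
# Python 2 wrote: for key, value in stat.iteritems():
for key, value in stat.items():         # Python 3 form
    print(key, value)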
The log files and model files that get generated:
[screenshots of the logs and models folders]
Launch an Anaconda Prompt, first change to the parent directory of the log folder (this step is required), and then run:
tensorboard --logdir D:\program\facenet\logs\20200930-082803
[screenshot of TensorBoard starting up]
Then open a browser and go to http://XiaRedMiG:6006, where XiaRedMiG is the local machine name and 6006 is the port. Click SCALARS and you will see the variable total_loss_1 created by the program; clicking it shows the following:
[screenshot of the total_loss_1 curve]
The plot shows how the loss evolves during training; the horizontal axis is the training step, here around 33k because I stopped the program after 33 epochs of 1000 batches each.
Correspondingly, the model is validated on LFW at the end of every epoch, and the accuracy curve looks like this:
[screenshot of the LFW accuracy curve]
The smoothing slider on the left changes how strongly the curves on the right are smoothed, and ticking show data download links lets you download the underlying data.
The train_softmax.py source is as follows:

"""Training a face recognizer with TensorFlow using softmax cross entropy loss
"""
# MIT License
# 
# Copyright (c) 2016 David Sandberg
# 
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# 
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# 
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import os.path
import time
import sys
import random
import tensorflow as tf
import numpy as np
import importlib
import argparse
import facenet
import lfw
import h5py
import math
import tensorflow.contrib.slim as slim
from tensorflow.python.ops import data_flow_ops
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops

def main(args):
    # Import the CNN network module specified by --model_def
    network = importlib.import_module(args.model_def)
    image_size = (args.image_size, args.image_size)
    
    # Current time, used to name this training run
    subdir = datetime.strftime(datetime.now(), '%Y%m%d-%H%M%S')
    # Log directory path
    log_dir = os.path.join(os.path.expanduser(args.logs_base_dir), subdir)
    if not os.path.isdir(log_dir):  # Create the log directory if it doesn't exist
        os.makedirs(log_dir)
    # Model directory path
    model_dir = os.path.join(os.path.expanduser(args.models_base_dir), subdir)
    if not os.path.isdir(model_dir):  # Create the model directory if it doesn't exist
        os.makedirs(model_dir)

    stat_file_name = os.path.join(log_dir, 'stat.h5')

    # Write arguments to a text file 
    facenet.write_arguments_to_file(args, os.path.join(log_dir, 'arguments.txt'))
        
    # Store some git revision info in a text file in the log directory
    src_path,_ = os.path.split(os.path.realpath(__file__))
    facenet.store_revision_info(src_path, log_dir, ' '.join(sys.argv))

    np.random.seed(seed=args.seed)
    random.seed(args.seed)
    # Prepare the training set: get every class (person) name and the absolute paths of its images
    dataset = facenet.get_dataset(args.data_dir)
    if args.filter_filename:
        dataset = filter_dataset(dataset, os.path.expanduser(args.filter_filename), 
            args.filter_percentile, args.filter_min_nrof_images_per_class)
        
    if args.validation_set_split_ratio>0.0:
        train_set, val_set = facenet.split_dataset(dataset, args.validation_set_split_ratio, args.min_nrof_val_images_per_class, 'SPLIT_IMAGES')
    else:
        train_set, val_set = dataset, []
        
    # Number of classes; every person is one class
    nrof_classes = len(train_set)
    
    print('Model directory: %s' % model_dir)
    print('Log directory: %s' % log_dir)
    # Was a pre-trained model specified?
    pretrained_model = None
    if args.pretrained_model:
        pretrained_model = os.path.expanduser(args.pretrained_model)
        print('Pre-trained model: %s' % pretrained_model)
    # Was an LFW path specified? It is used for evaluation
    if args.lfw_dir:
        print('LFW directory: %s' % args.lfw_dir)
        # Read the file containing the pairs used for testing
        pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
        # Get the paths for the corresponding images
        lfw_paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs)
    
    with tf.Graph().as_default():
        tf.set_random_seed(args.seed)
        # Global training step counter
        global_step = tf.Variable(0, trainable=False)
        
        # Get a list of image paths and their labels
        # image_list: list, one element per image path
        # label_list: list, one integer label (0, 1, 2, ...) per image
        image_list, label_list = facenet.get_image_paths_and_labels(train_set)
        assert len(image_list)>0, 'The training set should not be empty'
        
        val_image_list, val_label_list = facenet.get_image_paths_and_labels(val_set)

        # Create a queue that produces indices into the image_list and label_list 
        labels = ops.convert_to_tensor(label_list, dtype=tf.int32)
        # Number of images
        range_size = array_ops.shape(labels)[0]
        # Create an index queue that produces the values 0 .. range_size-1
        index_queue = tf.train.range_input_producer(range_size, num_epochs=None,
                             shuffle=True, seed=None, capacity=32)  
        # Dequeue args.batch_size*args.epoch_size indices at a time, i.e. one epoch worth of samples
        index_dequeue_op = index_queue.dequeue_many(args.batch_size*args.epoch_size, 'index_dequeue')
        
        # Define placeholders
        learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate')
        batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size')
        phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train')
        image_paths_placeholder = tf.placeholder(tf.string, shape=(None,1), name='image_paths')
        labels_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='labels')
        control_placeholder = tf.placeholder(tf.int32, shape=(None,1), name='control')
        
        # Create a FIFO queue; each element holds an input image path, its label and a control flag; shapes gives the shape of each component
        nrof_preprocess_threads = 4
        input_queue = data_flow_ops.FIFOQueue(capacity=2000000,
                                    dtypes=[tf.string, tf.int32, tf.int32],
                                    shapes=[(1,), (1,), (1,)],
                                    shared_name=None, name=None)
        # Enqueue op
        enqueue_op = input_queue.enqueue_many([image_paths_placeholder, labels_placeholder, control_placeholder], name='enqueue_op')
        # Dequeue op: each training step pulls one batch of data
        image_batch, label_batch = facenet.create_input_pipeline(input_queue, image_size, nrof_preprocess_threads, batch_size_placeholder)

        # Named identity ops (aliases for the tensors)
        image_batch = tf.identity(image_batch, 'image_batch')
        image_batch = tf.identity(image_batch, 'input')
        label_batch = tf.identity(label_batch, 'label_batch')
        
        print('Number of classes in training set: %d' % nrof_classes)
        print('Number of examples in training set: %d' % len(image_list))

        print('Number of classes in validation set: %d' % len(val_set))
        print('Number of examples in validation set: %d' % len(val_image_list))
        
        print('Building training graph')
        
        # Build the inference graph 
        # Build the CNN; the last layer outputs prelogits of shape [batch_size, embedding_size]
        prelogits, _ = network.inference(image_batch, args.keep_probability, 
            phase_train=phase_train_placeholder, bottleneck_layer_size=args.embedding_size, 
            weight_decay=args.weight_decay)
        # Per-class logits, shape [batch_size, number of people]
        logits = slim.fully_connected(prelogits, len(train_set), activation_fn=None, 
                weights_initializer=slim.initializers.xavier_initializer(), 
                weights_regularizer=slim.l2_regularizer(args.weight_decay),
                scope='Logits', reuse=False)
        
        # L2-normalize each row: divide every element of a row by that row's L2 norm
        embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings')

        # Norm for the prelogits
        eps = 1e-4
        # Take |prelogits|, compute the norm of order prelogits_norm_p along axis=1, then average
        prelogits_norm = tf.reduce_mean(tf.norm(tf.abs(prelogits)+eps, ord=args.prelogits_norm_p, axis=1))
        # Add prelogits_norm * args.prelogits_norm_loss_factor to the tf.GraphKeys.REGULARIZATION_LOSSES collection
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, prelogits_norm * args.prelogits_norm_loss_factor)

        # Add center loss: compute it and append it to the tf.GraphKeys.REGULARIZATION_LOSSES collection
        prelogits_center_loss, _ = facenet.center_loss(prelogits, label_batch, args.center_loss_alfa, nrof_classes)
        tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, prelogits_center_loss * args.center_loss_factor)

        # Exponentially decaying learning rate
        learning_rate = tf.train.exponential_decay(learning_rate_placeholder, global_step,
            args.learning_rate_decay_epochs*args.epoch_size, args.learning_rate_decay_factor, staircase=True)
        tf.summary.scalar('learning_rate', learning_rate)

        # Calculate the average cross entropy loss across the batch
        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=label_batch, logits=logits, name='cross_entropy_per_example')
        cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
        # Add the cross-entropy loss to the 'losses' collection
        tf.add_to_collection('losses', cross_entropy_mean)
        
        # Compute the accuracy; correct_prediction has shape [batch_size]
        correct_prediction = tf.cast(tf.equal(tf.argmax(logits, 1), tf.cast(label_batch, tf.int64)), tf.float32)
        accuracy = tf.reduce_mean(correct_prediction)
        
        # Calculate the total losses  https://blog.csdn.net/uestc_c2_403/article/details/72415791
        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
        total_loss = tf.add_n([cross_entropy_mean] + regularization_losses, name='total_loss')

        # Build a Graph that trains the model with one batch of examples and updates the model parameters
        train_op = facenet.train(total_loss, global_step, args.optimizer, 
            learning_rate, args.moving_average_decay, tf.global_variables(), args.log_histograms)
        
        # Create a saver
        saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)

        # Build the summary operation based on the TF collection of Summaries.
        summary_op = tf.summary.merge_all()

        # Start running operations on the Graph.
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
        sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
        sess.run(tf.global_variables_initializer())
        sess.run(tf.local_variables_initializer())
        summary_writer = tf.summary.FileWriter(log_dir, sess.graph)
        # Create a coordinator to manage the queue threads
        coord = tf.train.Coordinator()
        # Only after start_queue_runners do the queues start being filled and data being read
        tf.train.start_queue_runners(coord=coord, sess=sess)

        # Start running the graph
        with sess.as_default():
            # Restore the pre-trained model if one was given
            if pretrained_model:
                print('Restoring pretrained model: %s' % pretrained_model)
                saver.restore(sess, pretrained_model)

            # Training and validation loop
            print('Running training')
            nrof_steps = args.max_nrof_epochs*args.epoch_size
            nrof_val_samples = int(math.ceil(args.max_nrof_epochs / args.validate_every_n_epochs))   # Validate every validate_every_n_epochs as well as in the last epoch
            stat = {
                'loss': np.zeros((nrof_steps,), np.float32),
                'center_loss': np.zeros((nrof_steps,), np.float32),
                'reg_loss': np.zeros((nrof_steps,), np.float32),
                'xent_loss': np.zeros((nrof_steps,), np.float32),
                'prelogits_norm': np.zeros((nrof_steps,), np.float32),
                'accuracy': np.zeros((nrof_steps,), np.float32),
                'val_loss': np.zeros((nrof_val_samples,), np.float32),
                'val_xent_loss': np.zeros((nrof_val_samples,), np.float32),
                'val_accuracy': np.zeros((nrof_val_samples,), np.float32),
                'lfw_accuracy': np.zeros((args.max_nrof_epochs,), np.float32),
                'lfw_valrate': np.zeros((args.max_nrof_epochs,), np.float32),
                'learning_rate': np.zeros((args.max_nrof_epochs,), np.float32),
                'time_train': np.zeros((args.max_nrof_epochs,), np.float32),
                'time_validate': np.zeros((args.max_nrof_epochs,), np.float32),
                'time_evaluate': np.zeros((args.max_nrof_epochs,), np.float32),
                'prelogits_hist': np.zeros((args.max_nrof_epochs, 1000), np.float32),
              }
            # Iterate over the epochs
            for epoch in range(1,args.max_nrof_epochs+1):
                step = sess.run(global_step, feed_dict=None)
                # Train for one epoch
                t = time.time()
                cont = train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_op, image_paths_placeholder, labels_placeholder,
                    learning_rate_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, global_step, 
                    total_loss, train_op, summary_op, summary_writer, regularization_losses, args.learning_rate_schedule_file,
                    stat, cross_entropy_mean, accuracy, learning_rate,
                    prelogits, prelogits_center_loss, args.random_rotate, args.random_crop, args.random_flip, prelogits_norm, args.prelogits_hist_max, args.use_fixed_image_standardization)
                stat['time_train'][epoch-1] = time.time() - t
                
                if not cont:
                    break
                  
                t = time.time()
                if len(val_image_list)>0 and ((epoch-1) % args.validate_every_n_epochs == args.validate_every_n_epochs-1 or epoch==args.max_nrof_epochs):
                    validate(args, sess, epoch, val_image_list, val_label_list, enqueue_op, image_paths_placeholder, labels_placeholder, control_placeholder,
                        phase_train_placeholder, batch_size_placeholder, 
                        stat, total_loss, regularization_losses, cross_entropy_mean, accuracy, args.validate_every_n_epochs, args.use_fixed_image_standardization)
                stat['time_validate'][epoch-1] = time.time() - t

                # Save variables and the metagraph if it doesn't exist already
                save_variables_and_metagraph(sess, saver, summary_writer, model_dir, subdir, epoch)

                # Evaluate on LFW
                t = time.time()
                if args.lfw_dir:
                    evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, 
                        embeddings, label_batch, lfw_paths, actual_issame, args.lfw_batch_size, args.lfw_nrof_folds, log_dir, step, summary_writer, stat, epoch, 
                        args.lfw_distance_metric, args.lfw_subtract_mean, args.lfw_use_flipped_images, args.use_fixed_image_standardization)
                stat['time_evaluate'][epoch-1] = time.time() - t

                print('Saving statistics')
                with h5py.File(stat_file_name, 'w') as f:
                    for key, value in stat.items():
                        f.create_dataset(key, data=value)
    
    return model_dir
  
def find_threshold(var, percentile):
    hist, bin_edges = np.histogram(var, 100)
    cdf = np.float32(np.cumsum(hist)) / np.sum(hist)
    bin_centers = (bin_edges[:-1]+bin_edges[1:])/2
    #plt.plot(bin_centers, cdf)
    threshold = np.interp(percentile*0.01, cdf, bin_centers)
    return threshold
  
def filter_dataset(dataset, data_filename, percentile, min_nrof_images_per_class):
    with h5py.File(data_filename,'r') as f:
        distance_to_center = np.array(f.get('distance_to_center'))
        label_list = np.array(f.get('label_list'))
        image_list = np.array(f.get('image_list'))
        distance_to_center_threshold = find_threshold(distance_to_center, percentile)
        indices = np.where(distance_to_center>=distance_to_center_threshold)[0]
        filtered_dataset = dataset
        removelist = []
        for i in indices:
            label = label_list[i]
            image = image_list[i]
            if image in filtered_dataset[label].image_paths:
                filtered_dataset[label].image_paths.remove(image)
            if len(filtered_dataset[label].image_paths)<min_nrof_images_per_class:
                removelist.append(label)

        ix = sorted(list(set(removelist)), reverse=True)
        for i in ix:
            del(filtered_dataset[i])

    return filtered_dataset
  
def train(args, sess, epoch, image_list, label_list, index_dequeue_op, enqueue_op, image_paths_placeholder, labels_placeholder, 
      learning_rate_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, step, 
      loss, train_op, summary_op, summary_writer, reg_losses, learning_rate_schedule_file, 
      stat, cross_entropy_mean, accuracy, 
      learning_rate, prelogits, prelogits_center_loss, random_rotate, random_crop, random_flip, prelogits_norm, prelogits_hist_max, use_fixed_image_standardization):
    batch_number = 0
    
    if args.learning_rate>0.0:
        lr = args.learning_rate
    else:
        lr = facenet.get_learning_rate_from_file(learning_rate_schedule_file, epoch)
        
    if lr<=0:
        return False 

    # One epoch consists of batch_size*epoch_size samples
    index_epoch = sess.run(index_dequeue_op)
    label_epoch = np.array(label_list)[index_epoch]
    image_epoch = np.array(image_list)[index_epoch]
    
    # Enqueue one epoch of image paths and labels
    labels_array = np.expand_dims(np.array(label_epoch),1)
    image_paths_array = np.expand_dims(np.array(image_epoch),1)
    control_value = facenet.RANDOM_ROTATE * random_rotate + facenet.RANDOM_CROP * random_crop + facenet.RANDOM_FLIP * random_flip + facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization
    control_array = np.ones_like(labels_array) * control_value
    sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array})

    # Training loop for one epoch
    train_time = 0
    while batch_number < args.epoch_size:
        start_time = time.time()
        feed_dict = {learning_rate_placeholder: lr, phase_train_placeholder:True, batch_size_placeholder:args.batch_size}
        tensor_list = [loss, train_op, step, reg_losses, prelogits, cross_entropy_mean, learning_rate, prelogits_norm, accuracy, prelogits_center_loss]
        if batch_number % 100 == 0:
            loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_, summary_str = sess.run(tensor_list + [summary_op], feed_dict=feed_dict)
            summary_writer.add_summary(summary_str, global_step=step_)
        else:
            loss_, _, step_, reg_losses_, prelogits_, cross_entropy_mean_, lr_, prelogits_norm_, accuracy_, center_loss_ = sess.run(tensor_list, feed_dict=feed_dict)
         
        duration = time.time() - start_time
        stat['loss'][step_-1] = loss_
        stat['center_loss'][step_-1] = center_loss_
        stat['reg_loss'][step_-1] = np.sum(reg_losses_)
        stat['xent_loss'][step_-1] = cross_entropy_mean_
        stat['prelogits_norm'][step_-1] = prelogits_norm_
        stat['learning_rate'][epoch-1] = lr_
        stat['accuracy'][step_-1] = accuracy_
        stat['prelogits_hist'][epoch-1,:] += np.histogram(np.minimum(np.abs(prelogits_), prelogits_hist_max), bins=1000, range=(0.0, prelogits_hist_max))[0]
        
        duration = time.time() - start_time
        print('Epoch: [%d][%d/%d]\tTime %.3f\tLoss %2.3f\tXent %2.3f\tRegLoss %2.3f\tAccuracy %2.3f\tLr %2.5f\tCl %2.3f' %
              (epoch, batch_number+1, args.epoch_size, duration, loss_, cross_entropy_mean_, np.sum(reg_losses_), accuracy_, lr_, center_loss_))
        batch_number += 1
        train_time += duration
    # Add validation loss and accuracy to summary
    summary = tf.Summary()
    #pylint: disable=maybe-no-member
    summary.value.add(tag='time/total', simple_value=train_time)
    summary_writer.add_summary(summary, global_step=step_)
    return True

def validate(args, sess, epoch, image_list, label_list, enqueue_op, image_paths_placeholder, labels_placeholder, control_placeholder,
             phase_train_placeholder, batch_size_placeholder, 
             stat, loss, regularization_losses, cross_entropy_mean, accuracy, validate_every_n_epochs, use_fixed_image_standardization):
  
    print('Running forward pass on validation set')

    nrof_batches = len(label_list) // args.lfw_batch_size
    nrof_images = nrof_batches * args.lfw_batch_size
    
    # Enqueue one epoch of image paths and labels
    labels_array = np.expand_dims(np.array(label_list[:nrof_images]),1)
    image_paths_array = np.expand_dims(np.array(image_list[:nrof_images]),1)
    control_array = np.ones_like(labels_array, np.int32)*facenet.FIXED_STANDARDIZATION * use_fixed_image_standardization
    sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array})

    loss_array = np.zeros((nrof_batches,), np.float32)
    xent_array = np.zeros((nrof_batches,), np.float32)
    accuracy_array = np.zeros((nrof_batches,), np.float32)

    # Training loop
    start_time = time.time()
    for i in range(nrof_batches):
        feed_dict = {phase_train_placeholder:False, batch_size_placeholder:args.lfw_batch_size}
        loss_, cross_entropy_mean_, accuracy_ = sess.run([loss, cross_entropy_mean, accuracy], feed_dict=feed_dict)
        loss_array[i], xent_array[i], accuracy_array[i] = (loss_, cross_entropy_mean_, accuracy_)
        if i % 10 == 9:
            print('.', end='')
            sys.stdout.flush()
    print('')

    duration = time.time() - start_time

    val_index = (epoch-1)//validate_every_n_epochs
    stat['val_loss'][val_index] = np.mean(loss_array)
    stat['val_xent_loss'][val_index] = np.mean(xent_array)
    stat['val_accuracy'][val_index] = np.mean(accuracy_array)

    print('Validation Epoch: %d\tTime %.3f\tLoss %2.3f\tXent %2.3f\tAccuracy %2.3f' %
          (epoch, duration, np.mean(loss_array), np.mean(xent_array), np.mean(accuracy_array)))


def evaluate(sess, enqueue_op, image_paths_placeholder, labels_placeholder, phase_train_placeholder, batch_size_placeholder, control_placeholder, 
        embeddings, labels, image_paths, actual_issame, batch_size, nrof_folds, log_dir, step, summary_writer, stat, epoch, distance_metric, subtract_mean, use_flipped_images, use_fixed_image_standardization):
    start_time = time.time()
    # Run forward pass to calculate embeddings
    print('Runnning forward pass on LFW images')
    
    # Enqueue one epoch of image paths and labels
    nrof_embeddings = len(actual_issame)*2  # nrof_pairs * nrof_images_per_pair
    nrof_flips = 2 if use_flipped_images else 1
    nrof_images = nrof_embeddings * nrof_flips
    labels_array = np.expand_dims(np.arange(0,nrof_images),1)
    image_paths_array = np.expand_dims(np.repeat(np.array(image_paths),nrof_flips),1)
    control_array = np.zeros_like(labels_array, np.int32)
    if use_fixed_image_standardization:
        control_array += np.ones_like(labels_array)*facenet.FIXED_STANDARDIZATION
    if use_flipped_images:
        # Flip every second image
        control_array += (labels_array % 2)*facenet.FLIP
    sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array, control_placeholder: control_array})
    
    embedding_size = int(embeddings.get_shape()[1])
    assert nrof_images % batch_size == 0, 'The number of LFW images must be an integer multiple of the LFW batch size'
    nrof_batches = nrof_images // batch_size
    emb_array = np.zeros((nrof_images, embedding_size))
    lab_array = np.zeros((nrof_images,))
    for i in range(nrof_batches):
        feed_dict = {phase_train_placeholder:False, batch_size_placeholder:batch_size}
        emb, lab = sess.run([embeddings, labels], feed_dict=feed_dict)
        lab_array[lab] = lab
        emb_array[lab, :] = emb
        if i % 10 == 9:
            print('.', end='')
            sys.stdout.flush()
    print('')
    embeddings = np.zeros((nrof_embeddings, embedding_size*nrof_flips))
    if use_flipped_images:
        # Concatenate embeddings for flipped and non flipped version of the images
        embeddings[:,:embedding_size] = emb_array[0::2,:]
        embeddings[:,embedding_size:] = emb_array[1::2,:]
    else:
        embeddings = emb_array

    assert np.array_equal(lab_array, np.arange(nrof_images))==True, 'Wrong labels used for evaluation, possibly caused by training examples left in the input pipeline'
    _, _, accuracy, val, val_std, far = lfw.evaluate(embeddings, actual_issame, nrof_folds=nrof_folds, distance_metric=distance_metric, subtract_mean=subtract_mean)
    
    print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy)))
    print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
    lfw_time = time.time() - start_time
    # Add validation loss and accuracy to summary
    summary = tf.Summary()
    #pylint: disable=maybe-no-member
    summary.value.add(tag='lfw/accuracy', simple_value=np.mean(accuracy))
    summary.value.add(tag='lfw/val_rate', simple_value=val)
    summary.value.add(tag='time/lfw', simple_value=lfw_time)
    summary_writer.add_summary(summary, step)
    with open(os.path.join(log_dir,'lfw_result.txt'),'at') as f:
        f.write('%d\t%.5f\t%.5f\n' % (step, np.mean(accuracy), val))
    stat['lfw_accuracy'][epoch-1] = np.mean(accuracy)
    stat['lfw_valrate'][epoch-1] = val

def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_name, step):
    # Save the model checkpoint
    print('Saving variables')
    start_time = time.time()
    checkpoint_path = os.path.join(model_dir, 'model-%s.ckpt' % model_name)
    saver.save(sess, checkpoint_path, global_step=step, write_meta_graph=False)
    save_time_variables = time.time() - start_time
    print('Variables saved in %.2f seconds' % save_time_variables)
    metagraph_filename = os.path.join(model_dir, 'model-%s.meta' % model_name)
    save_time_metagraph = 0  
    if not os.path.exists(metagraph_filename):
        print('Saving metagraph')
        start_time = time.time()
        saver.export_meta_graph(metagraph_filename)
        save_time_metagraph = time.time() - start_time
        print('Metagraph saved in %.2f seconds' % save_time_metagraph)
    summary = tf.Summary()
    #pylint: disable=maybe-no-member
    summary.value.add(tag='time/save_variables', simple_value=save_time_variables)
    summary.value.add(tag='time/save_metagraph', simple_value=save_time_metagraph)
    summary_writer.add_summary(summary, step)
  

def parse_arguments(argv):
    '''
    Parse command-line arguments.
    '''
    parser = argparse.ArgumentParser()
    
    # Directory for saving event logs
    parser.add_argument('--logs_base_dir', type=str, 
        help='Directory where to write event logs.', default='~/logs/facenet')
    # Directory for saving trained models and checkpoints
    parser.add_argument('--models_base_dir', type=str,
        help='Directory where to write trained models and checkpoints.', default='~/models/facenet')
    # Fraction of GPU memory the process is allowed to use
    parser.add_argument('--gpu_memory_fraction', type=float,
        help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
    # Pretrained model to load before training starts
    parser.add_argument('--pretrained_model', type=str,
        help='Load a pretrained model before training starts.')
    # Path to the training data after MTCNN face detection and alignment
    parser.add_argument('--data_dir', type=str,
        help='Path to the data directory containing aligned face patches.',
        default='~/datasets/casia/casia_maxpy_mtcnnalign_182_160')
    # Network architecture to use
    parser.add_argument('--model_def', type=str,
        help='Model definition. Points to a module containing the definition of the inference graph.', default='models.inception_resnet_v1')
    # Number of training epochs
    parser.add_argument('--max_nrof_epochs', type=int,
        help='Number of epochs to run.', default=500)
    # Batch size
    parser.add_argument('--batch_size', type=int,
        help='Number of images to process in a batch.', default=90)
    # Input image size
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    # Number of batches per epoch
    parser.add_argument('--epoch_size', type=int,
        help='Number of batches per epoch.', default=1000)
    # Dimensionality of the embedding
    parser.add_argument('--embedding_size', type=int,
        help='Dimensionality of the embedding.', default=128)
    # Random cropping of training images
    parser.add_argument('--random_crop', 
        help='Performs random cropping of training images. If false, the center image_size pixels from the training images are used. ' +
         'If the size of the images in the data directory is equal to image_size no cropping is performed', action='store_true')
    # Random horizontal flipping
    parser.add_argument('--random_flip', 
        help='Performs random horizontal flipping of training images.', action='store_true')
    # Random rotation
    parser.add_argument('--random_rotate', 
        help='Performs random rotations of training images.', action='store_true')
    parser.add_argument('--use_fixed_image_standardization', 
        help='Performs fixed standardization of images.', action='store_true')
    # Dropout keep probability
    parser.add_argument('--keep_probability', type=float,
        help='Keep probability of dropout for the fully connected layer(s).', default=1.0)
    # L2 regularization coefficient
    parser.add_argument('--weight_decay', type=float,
        help='L2 weight regularization.', default=0.0)
    # Weight balancing the center loss against the softmax loss
    parser.add_argument('--center_loss_factor', type=float,
        help='Center loss factor.', default=0.0)
    # Internal parameter of the center loss (center update rate)
    parser.add_argument('--center_loss_alfa', type=float,
        help='Center update rate for center loss.', default=0.95)
    parser.add_argument('--prelogits_norm_loss_factor', type=float,
        help='Loss based on the norm of the activations in the prelogits layer.', default=0.0)
    parser.add_argument('--prelogits_norm_p', type=float,
        help='Norm to use for prelogits norm loss.', default=1.0)
    parser.add_argument('--prelogits_hist_max', type=float,
        help='The max value for the prelogits histogram.', default=10.0)
    # Optimizer
    parser.add_argument('--optimizer', type=str, choices=['ADAGRAD', 'ADADELTA', 'ADAM', 'RMSPROP', 'MOM'],
        help='The optimization algorithm to use', default='ADAGRAD')
    # Initial learning rate
    parser.add_argument('--learning_rate', type=float,
        help='Initial learning rate. If set to a negative value a learning rate ' +
        'schedule can be specified in the file "learning_rate_schedule.txt"', default=0.1)
    parser.add_argument('--learning_rate_decay_epochs', type=int,
        help='Number of epochs between learning rate decay.', default=100)
    parser.add_argument('--learning_rate_decay_factor', type=float,
        help='Learning rate decay factor.', default=1.0)
    parser.add_argument('--moving_average_decay', type=float,
        help='Exponential decay for tracking of training parameters.', default=0.9999)
    parser.add_argument('--seed', type=int,
        help='Random seed.', default=666)
    parser.add_argument('--nrof_preprocess_threads', type=int,
        help='Number of preprocessing (data loading and augmentation) threads.', default=4)
    parser.add_argument('--log_histograms', 
        help='Enables logging of weight/bias histograms in tensorboard.', action='store_true')
    parser.add_argument('--learning_rate_schedule_file', type=str,
        help='File containing the learning rate schedule that is used when learning_rate is set to -1.', default='data/learning_rate_schedule.txt')
    parser.add_argument('--filter_filename', type=str,
        help='File containing image data used for dataset filtering', default='')
    parser.add_argument('--filter_percentile', type=float,
        help='Keep only the percentile of images closest to their class center', default=100.0)
    parser.add_argument('--filter_min_nrof_images_per_class', type=int,
        help='Keep only the classes with this number of examples or more', default=0)
    parser.add_argument('--validate_every_n_epochs', type=int,
        help='Number of epoch between validation', default=5)
    parser.add_argument('--validation_set_split_ratio', type=float,
        help='The ratio of the total dataset to use for validation', default=0.0)
    parser.add_argument('--min_nrof_val_images_per_class', type=float,
        help='Classes with fewer images will be removed from the validation set', default=0)
 
    # Parameters for validation on LFW
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    # Path to the LFW data after MTCNN face detection and alignment
    parser.add_argument('--lfw_dir', type=str,
        help='Path to the data directory containing aligned face patches.', default='')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    parser.add_argument('--lfw_distance_metric', type=int,
        help='Type of distance metric to use. 0: Euclidean, 1: Cosine similarity distance.', default=0)
    parser.add_argument('--lfw_use_flipped_images', 
        help='Concatenates embeddings for the image and its horizontally flipped counterpart.', action='store_true')
    parser.add_argument('--lfw_subtract_mean', 
        help='Subtract feature mean before calculating distance.', action='store_true')
    return parser.parse_args(argv)
  

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

To summarize, the training script proceeds as follows:

- Build the CNN used to extract face features;
- Prepare the training data, loading the dataset through the TensorFlow queue mechanism;
- Define the loss for the CNN: L2 weight regularization, the center loss, and the cross-entropy cost (strictly speaking, the softmax loss), as sketched below;
- Train on the dataset and evaluate on LFW.
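To make the loss definition concrete, here is a minimal TensorFlow 1.x sketch of how these three terms can be combined, following the general pattern of the facenet code. It assumes the center_loss helper from facenet/src/facenet.py; the tensor names prelogits, logits, label_batch and nrof_classes are illustrative placeholders rather than the exact variables in the script:

import tensorflow as tf
import facenet

# Softmax cross-entropy on the class logits
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=label_batch, logits=logits, name='cross_entropy_per_example')
cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')

# Center loss, weighted by --center_loss_factor and added to the regularization losses
prelogits_center_loss, _ = facenet.center_loss(prelogits, label_batch,
                                               args.center_loss_alfa, nrof_classes)
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES,
                     prelogits_center_loss * args.center_loss_factor)

# L2 weight-decay terms are collected automatically by the slim layers (--weight_decay);
# the total loss is the cross entropy plus all regularization terms
regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
total_loss = tf.add_n([cross_entropy_mean] + regularization_losses, name='total_loss')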

Sometimes we want to retrain a pretrained model on our own dataset, or we have already trained a model, decide the number of epochs was not enough, and do not want to start from scratch. In these cases the previously trained model has to be loaded before training, as follows:
Step 1: add a default for the pretrained-model argument.
In train_tripletloss.py, find the add_argument call for --pretrained_model (shown as a screenshot in the original post) and change it to:

parser.add_argument('--pretrained_model', type=str,
        help='Load a pretrained model before training starts.', default='<path to your pretrained model>')

Step 2: fix a small bug in the script.
If you only complete step 1, running the script will raise an error. Debugging shows this comes from a small bug that needs fixing:
Find the block that restores the pretrained model (shown as a screenshot in the original post). Its purpose is clear: if the pretrained-model argument is non-empty, it reloads the model parameters with TensorFlow's saver.restore(), but that call is where the error occurs.
Following the model-loading approach used in compare.py, change it to:
facenet.load_model(args.pretrained_model)
Run the script again and it now executes normally.
If you want to double-check, take a model that has already been trained, load it, and train for one epoch: the initial loss is very small, and after that epoch the accuracy is already close to that of the loaded pretrained model, which confirms the model was loaded successfully.
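For reference, here is a minimal sketch of the change, assuming the original restore logic in train_tripletloss.py looks roughly like the commented lines below (saver, sess and args as used elsewhere in the script):

# Original restore logic (this is where the error occurs):
# if args.pretrained_model:
#     print('Restoring pretrained model: %s' % args.pretrained_model)
#     saver.restore(sess, os.path.expanduser(args.pretrained_model))

# Replacement, following the loading approach used in compare.py;
# facenet.load_model() accepts either a checkpoint directory or a frozen .pb graph
if args.pretrained_model:
    print('Restoring pretrained model: %s' % args.pretrained_model)
    facenet.load_model(args.pretrained_model)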

References:
[1] Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
[2] The official facenet code
[3] Another implementation (MXNet)
[4] "21个项目玩转深度学习" by 何之源 (part of this post comes from chapter 6 of that book; see its GitHub repository)
[5] How to build a face recognition framework with OpenFace
[6] How to apply the MTCNN and FaceNet models for face detection and recognition (a fairly detailed explanation of the theory)
[7] A TensorFlow implementation of MTCNN
[8] Face recognition (FaceNet)
[9] [Datasets] FaceDataset: commonly used face databases
[10] https://www.cnblogs.com/zyly/p/9703614.html#_label3

Many other articles were also a big help, but it has been a while and I have lost the links; my thanks to all of them here.
