[Face Recognition] FaceNet (Part 2)

Source code: https://github.com/davidsandberg/facenet

This post covers how FaceNet builds its computation graph and how the model is evaluated.
For Triplet Loss and Triplet Selection, see FaceNet (Part 1): Triplet Loss.

###inception_resnet_v1.py
[Figure: three different network models]
The figure above shows three different network models. Here we use inception_resnet_v1; for details on inception_resnet_v1, see https://blog.csdn.net/lovelyaiq/article/details/79026181

###train_tripletloss.py

#####Building the computation graph; prelogits is the output of the last layer

    # Import the model; model_def defaults to 'models.inception_resnet_v1'
    network = importlib.import_module(args.model_def)

    # prelogits is the output of the last layer
    prelogits, _ = network.inference(image_batch, args.keep_probability,
        phase_train=phase_train_placeholder, bottleneck_layer_size=args.embedding_size,
        weight_decay=args.weight_decay)
    # L2-normalize the final output to obtain the embedding of each image
    # tf.nn.l2_normalize(x, dim, epsilon, name): dim=0 normalizes each column, dim=1 normalizes each row;
    # epsilon is a lower bound for the norm, which guards against division by zero
    embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings')
    # Split embeddings into anchor, positive and negative parts
    anchor, positive, negative = tf.unstack(tf.reshape(embeddings, [-1,3,args.embedding_size]), 3, 1)
    # Compute the triplet loss from these three parts
    triplet_loss = facenet.triplet_loss(anchor, positive, negative, args.alpha)
    # Define the learning-rate schedule: apply exponential decay to the learning rate
    # tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=True/False)
    # With staircase=True the learning rate is updated once every decay_steps steps;
    # with staircase=False it is updated at every step.
    learning_rate = tf.train.exponential_decay(learning_rate_placeholder, global_step,
        args.learning_rate_decay_epochs*args.epoch_size, args.learning_rate_decay_factor, staircase=True)
    tf.summary.scalar('learning_rate', learning_rate)

    # Calculate the total losses
    # Collect the regularization losses
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    # The total loss is the triplet loss plus the regularization losses
    total_loss = tf.add_n([triplet_loss] + regularization_losses, name='total_loss')

    # Build a Graph that trains the model with one batch of examples and updates the model parameters
    # Optimize total_loss with the optimizer and learning rate defined above
    train_op = facenet.train(total_loss, global_step, args.optimizer,
        learning_rate, args.moving_average_decay, tf.global_variables())
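
To make the graph code above more concrete, here is a minimal plain-Python sketch of what those steps compute on a tiny batch: each row is L2-normalized, consecutive rows are regrouped into (anchor, positive, negative) triplets, and the loss averages max(d(a, p) - d(a, n) + alpha, 0) over the triplets using squared Euclidean distances. A small helper also shows the staircase form of exponential decay. All helper names and numbers are made up for illustration, and the loss formula is the usual FaceNet formulation rather than a copy of `facenet.triplet_loss`.

```python
import math

def l2_normalize_rows(rows, epsilon=1e-10):
    """Scale every row to unit L2 norm (what dim=1 does for a 2-D tensor)."""
    out = []
    for row in rows:
        norm = math.sqrt(max(sum(v * v for v in row), epsilon))
        out.append([v / norm for v in row])
    return out

def split_triplets(embeddings):
    """Rows 3k, 3k+1, 3k+2 form the k-th (anchor, positive, negative) triplet."""
    return [tuple(embeddings[i:i + 3]) for i in range(0, len(embeddings), 3)]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(triplets, alpha):
    """Mean over triplets of max(d(a, p) - d(a, n) + alpha, 0)."""
    losses = [max(squared_distance(a, p) - squared_distance(a, n) + alpha, 0.0)
              for a, p, n in triplets]
    return sum(losses) / len(losses)

def staircase_decay(base_lr, global_step, decay_steps, decay_rate):
    """Exponential decay with staircase=True: the exponent is floored."""
    return base_lr * decay_rate ** (global_step // decay_steps)

# Toy batch: two triplets of 2-D "prelogits" (made-up numbers).
prelogits = [
    [3.0, 4.0], [6.0, 8.0], [4.0, -3.0],   # anchor, positive, negative
    [1.0, 0.0], [0.0, 2.0], [5.0, 0.0],    # anchor, positive, negative
]
embeddings = l2_normalize_rows(prelogits)
loss = triplet_loss(split_triplets(embeddings), alpha=0.2)
print(loss)
print(staircase_decay(0.1, global_step=250, decay_steps=100, decay_rate=0.5))
```

The first triplet already satisfies the margin and contributes zero loss; the second has its negative closer to the anchor than its positive, so it carries the whole loss. With the staircase schedule, the rate stays constant within each block of `decay_steps` steps.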

FaceNet evaluation metrics

    import numpy as np

    def calculate_accuracy(threshold, dist, actual_issame):
        predict_issame = np.less(dist, threshold)
        # tp counts the pairs where predict_issame and actual_issame are both True,
        # i.e. true positives: A-P pairs predicted correctly
        tp = np.sum(np.logical_and(predict_issame, actual_issame))
        # false positives: A-N pairs predicted as A-P
        fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
        # true negatives: A-N pairs predicted correctly
        tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame)))
        # false negatives: A-P pairs predicted as A-N
        fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame))

        # ROC curve: fpr on the x-axis, tpr on the y-axis; the closer the curve is to the top-left corner, the better
        # The ROC curve stays the same when the ratio of positive to negative samples in the test set changes
        # true positive rate, tp/p: the fraction of A-P pairs predicted correctly
        tpr = 0 if (tp + fn == 0) else float(tp) / float(tp + fn)
        # false positive rate, fp/n: the fraction of A-N pairs predicted as A-P
        fpr = 0 if (fp + tn == 0) else float(fp) / float(fp + tn)
        # accuracy: the fraction of all pairs predicted correctly
        acc = float(tp + tn) / dist.size
        return tpr, fpr, acc

    def calculate_val_far(threshold, dist, actual_issame):
        predict_issame = np.less(dist, threshold)
        # A-P pairs predicted correctly
        tp = np.sum(np.logical_and(predict_issame, actual_issame))
        # A-N pairs predicted as A-P
        fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
        # total number of A-P pairs
        n_same = np.sum(actual_issame)
        # total number of A-N pairs
        n_diff = np.sum(np.logical_not(actual_issame))

        val = float(tp) / float(n_same)
        far = float(fp) / float(n_diff)

        return val, far

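In fact, tpr = val and fpr = far, because tp + fn = n_same and fp + tn = n_diff.
val and far are computed separately in order to find the threshold that gives FAR = far_target and to report VAL at that threshold.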
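
The equality can be checked on a toy example. The sketch below uses plain Python instead of NumPy, with made-up distances and a hypothetical helper `confusion_counts`; it computes the rates both ways, following the formulas in `calculate_accuracy` and `calculate_val_far`.

```python
def confusion_counts(dist, actual_issame, threshold):
    """Count tp, fp, tn, fn when a pair is predicted 'same' iff dist < threshold."""
    tp = fp = tn = fn = 0
    for d, same in zip(dist, actual_issame):
        predicted_same = d < threshold
        if predicted_same and same:
            tp += 1
        elif predicted_same and not same:
            fp += 1
        elif not predicted_same and not same:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Made-up distances for six pairs; True marks a genuine (same-identity) pair.
dist = [0.3, 0.9, 1.4, 0.5, 1.1, 1.6]
actual_issame = [True, True, True, False, False, False]

tp, fp, tn, fn = confusion_counts(dist, actual_issame, threshold=1.0)
n_same = sum(actual_issame)
n_diff = len(actual_issame) - n_same

tpr, fpr = tp / (tp + fn), fp / (fp + tn)   # as in calculate_accuracy
val, far = tp / n_same, fp / n_diff         # as in calculate_val_far
print(tpr == val, fpr == far)
```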

    import numpy as np
    from scipy import interpolate

    def calculate_val(thresholds, embeddings, actual_issame, far_target):
        dist, nrof_thresholds = calculate_dist(thresholds, embeddings)

        # Find the threshold that gives FAR = far_target
        far_val = np.zeros(nrof_thresholds)
        for threshold_idx, threshold in enumerate(thresholds):
            _, far_val[threshold_idx] = calculate_val_far(threshold, dist, actual_issame)
        if np.max(far_val) >= far_target:
            # interp1d performs one-dimensional interpolation: from the values of a function at a
            # finite set of points it estimates the value at other points.
            # kind='slinear' is a first-order (linear) spline.
            f = interpolate.interp1d(far_val, thresholds, kind='slinear')
            threshold = f(far_target)
        else:
            threshold = 0.0

        val, far = calculate_val_far(threshold, dist, actual_issame)

        return val, far, threshold
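
The threshold search in calculate_val can be illustrated without SciPy. The sketch below is a plain-Python stand-in, not the `interp1d` call itself: it samples FAR on a grid of thresholds and then linearly interpolates the threshold at the target FAR. The helper names, the distances, and the assumption that the sampled FAR values are non-decreasing are all mine.

```python
def far_at(threshold, dist, actual_issame):
    """Fraction of different-identity pairs wrongly accepted at this threshold."""
    impostor = [d for d, same in zip(dist, actual_issame) if not same]
    return sum(d < threshold for d in impostor) / len(impostor)

def interpolate_threshold(far_values, thresholds, far_target):
    """Piecewise-linear lookup of the threshold as a function of FAR.

    Assumes far_values is non-decreasing. Returns 0.0 when the target FAR
    is never reached, mirroring the else branch of calculate_val.
    """
    if max(far_values) < far_target:
        return 0.0
    points = list(zip(far_values, thresholds))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= far_target <= x1 and x1 > x0:
            return y0 + (y1 - y0) * (far_target - x0) / (x1 - x0)
    return thresholds[0]  # target lies at or below the first sampled FAR

# Made-up pair distances; True marks a same-identity pair.
dist = [0.3, 0.9, 1.4, 0.5, 1.1, 1.6, 1.9]
actual_issame = [True, True, True, False, False, False, False]

thresholds = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
far_values = [far_at(t, dist, actual_issame) for t in thresholds]
threshold = interpolate_threshold(far_values, thresholds, far_target=0.4)
print(threshold, far_at(threshold, dist, actual_issame))
```

Because the empirical FAR is a step function of the threshold, the FAR measured at the interpolated threshold need not equal far_target exactly. This is why calculate_val recomputes val and far at the chosen threshold.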

What some of the files do:

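I. Face clustering based on MTCNN (which handles face detection and facial landmark localization) and FaceNet

File: facenet/contributed/cluster.py

File: facenet/contributed/clustering.py implements similar functionality, but without MTCNN detection

1. Use MTCNN to detect, align and crop the faces

2. Use FaceNet to compute an embedding for each cropped face

3. Cluster the embedding vectors by Euclidean distance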
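
As a rough illustration of step 3, the sketch below groups embeddings whose Euclidean distance falls below a threshold, treating them as connected components of a graph. This is a deliberately simple stand-in chosen for clarity; it is not the algorithm used in cluster.py or clustering.py, and the function names, embeddings and threshold are all made up.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cluster_by_distance(embeddings, threshold):
    """Label connected components of the graph that links any two
    embeddings whose Euclidean distance is below `threshold`."""
    labels = [None] * len(embeddings)
    next_label = 0
    for start in range(len(embeddings)):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(embeddings)):
                if labels[j] is None and euclidean(embeddings[i], embeddings[j]) < threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

# Made-up 2-D embeddings: two close pairs and one outlier.
embeddings = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0], [0.1, 0.99], [-1.0, 0.0]]
labels = cluster_by_distance(embeddings, threshold=0.5)
print(labels)
```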

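II. Face recognition based on MTCNN and FaceNet (given a single image, decide who the person is)

File: facenet/contributed/predict.py

1. Use MTCNN to detect, align and crop the faces

2. Use FaceNet to compute an embedding for each cropped face

3. Run predict.py to recognize the face (this requires a trained SVM model)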

III. Export the embeddings and labels of a directory of images as numpy arrays

File: facenet/contributed/export_embeddings.py

1. The input images must already be aligned and cropped

2. Outputs embeddings.npy, labels.npy and label_strings.npy

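IV. Perform face alignment and calculate the distance between the embeddings of images

File: facenet/src/compare.py

1. Use MTCNN to detect, align and crop the faces

2. Use FaceNet to compute an embedding for each cropped face

3. Compute the distances between the input images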
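
Step 3 amounts to a pairwise distance matrix over the embeddings. Here is a minimal plain-Python sketch, assuming Euclidean distance between embeddings; the file names and embedding values are hypothetical and this is not code from compare.py.

```python
import math

def distance_matrix(embeddings):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
             for v in embeddings] for u in embeddings]

# Made-up unit-length embeddings for three hypothetical images.
names = ["img_a.png", "img_b.png", "img_c.png"]
embeddings = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
matrix = distance_matrix(embeddings)
for name, row in zip(names, matrix):
    print(name, " ".join("%.4f" % d for d in row))
```

Since the embeddings are L2-normalized, every distance lies between 0 (identical direction) and 2 (opposite direction), so one threshold can be applied to all pairs.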

Reference: http://blog.csdn.net/qq_36673141/article/details/78958582
