TensorFlow + FaceNet: Training a Model

  • 1. Directory structure
    • 1.1 FaceNet overview
  • 2. Model preparation
    • 2.1 Download the FaceNet source code
    • 2.2 Download a pretrained model
  • Environment setup (method 1)
    • 2.3 Data preprocessing (align_dataset_mtcnn.py)
    • 2.4 Running the face comparison program (compare.py)
    • 2.5 Retraining the model

1. Directory structure

1.1 FaceNet overview

FaceNet is a model proposed by Google for face recognition (k-NN), verification (deciding whether two faces are the same person), and clustering (finding which faces belong to the same people). Unlike classification networks such as MobileNet, GoogLeNet, VGG16/19 and ResNet, FaceNet learns an embedding directly in an end-to-end fashion, and does so with fewer parameters.
A batch of images is fed into a deep feature-extraction network (the paper uses ZF-Net and Inception), the extracted features are L2-normalized, an embedding layer encodes them into 128-d vectors, and the model is optimized with the triplet loss.
Overall pipeline:

  • A batch of images goes through a classification backbone (e.g. ZFNet or Inception V1 from the paper) to extract features.
  • The extracted features are L2-normalized, producing the embedding (each image is represented by a 128-dimensional vector).
  • The embeddings are used to compute the triplet loss.
  • The L2 normalization follows the tf.nn.l2_normalize implementation (see the sketch after this list):
  • output = x / sqrt(max(sum(x**2), epsilon))
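
To make the last two bullets concrete, here is a minimal NumPy sketch (not code from the repository) of per-row L2 normalization and of the triplet loss with margin alpha that the optimization step uses:

import numpy as np

def l2_normalize(x, epsilon=1e-10):
    # output = x / sqrt(max(sum(x**2), epsilon)), applied row by row
    norm = np.sqrt(np.maximum(np.sum(np.square(x), axis=1, keepdims=True), epsilon))
    return x / norm

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # The anchor-positive distance should be smaller than the anchor-negative
    # distance by at least the margin alpha; anything better than that costs nothing.
    pos_dist = np.sum(np.square(anchor - positive), axis=1)
    neg_dist = np.sum(np.square(anchor - negative), axis=1)
    return np.mean(np.maximum(pos_dist - neg_dist + alpha, 0.0))

emb = l2_normalize(np.random.randn(9, 128))      # 9 images -> 3 triplets of 128-d embeddings
print(triplet_loss(emb[0::3], emb[1::3], emb[2::3]))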

One folder not covered above is models, which contains the implementations of the different CNN architectures FaceNet can use, e.g. Inception-ResNet-v1.

2. Model preparation

2.1 Download the FaceNet source code

Download address: facenet source code (the davidsandberg/facenet repository)

  • Triplet loss training
    Goal: train a model with the triplet loss.
    Steps:

  • Install TensorFlow.

  • Download davidsandberg/facenet and set up the Python paths.

  • Prepare the training data: download the dataset and run face alignment (align_dataset_mtcnn.py).

  • Train with train_tripletloss.py.

  • Watch the training progress in TensorBoard.

    • MTCNN part: performs face detection and alignment and outputs 160×160 images;
    • CNN part: maps a face image (160×160 by default) to a Euclidean space in which distance encodes face similarity. Once this embedding space exists, face recognition, verification and clustering become straightforward;

2.2 Download a pretrained model

facenet provides two pretrained models, trained on the CASIA-WebFace and MS-Celeb-1M face databases respectively. They are hosted on Google Drive; here is a Baidu netdisk mirror of one of them (password: 12mh).

  • model.meta: the metagraph file, i.e. the structure of the computation graph;
  • model.ckpt.data: the weights file, holding the values of all variables in the graph;
  • model.ckpt.index: describes how to match the meta and data files;
  • pb file: the graph and the weights merged into a single file, mainly used for deployment; see https://blog.csdn.net/yjl9122/article/details/78341689 for details;
  • There is usually also a checkpoint file that records the absolute paths of the latest checkpoint files (the last three files above), i.e. which snapshot is newest and where it lives. tf.train.latest_checkpoint relies on this information, so if the stored paths are changed or deleted, loading fails because the files cannot be found. A minimal loading sketch follows this list;
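
A minimal TF 1.x sketch of restoring these files (the directory and file names below follow the 20170512-110547 pretrained model and are only an example):

import tensorflow as tf

model_dir = 'models/facenet/model-20170512-110547'
with tf.Graph().as_default(), tf.Session() as sess:
    # model-*.meta holds the graph structure, model-*.ckpt-* hold the weights,
    # and the checkpoint file lets latest_checkpoint locate the newest snapshot
    saver = tf.train.import_meta_graph(model_dir + '/model-20170512-110547.meta')
    saver.restore(sess, tf.train.latest_checkpoint(model_dir))
    embeddings = tf.get_default_graph().get_tensor_by_name('embeddings:0')
    print(embeddings.get_shape())  # (?, 128)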

Environment setup (method 1)

Add the facenet folder to Python's default module search path by copying the whole facenet directory into lib\site-packages under the Anaconda3 installation directory:
Then cut the files under the src directory, delete everything else under facenet, and paste the src files directly under facenet. This makes the packages under src importable (so import align.detect_face does not raise an error).
Run python in an Anaconda Prompt and type import facenet; if no error is raised the setup works (an alternative based on sys.path is sketched after the reference links below):
https://blog.csdn.net/xingwei_09/article/details/79161931
https://www.cnblogs.com/zyly/p/9703614.html
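
If you prefer not to copy files into site-packages, an alternative sketch is to extend sys.path at the top of your scripts (the path below is a placeholder for your own clone location):

import sys
sys.path.append(r'D:\path\to\facenet\src')   # adjust to where you cloned the repo

import facenet               # should now import without error
import align.detect_face     # the MTCNN code under src/align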

2.3 Data preprocessing (align_dataset_mtcnn.py)

The MTCNN implementation lives mainly in src/align, whose contents are:

  • detect_face.py: defines the MTCNN model, composed of P-Net, R-Net and O-Net; pretrained weights for the three networks are provided in det1.npy, det2.npy and det3.npy.
  • align_dataset_mtcnn.py: the entry point that uses the MTCNN model for face detection and alignment.

To run face detection and alignment on the LFW database with align_dataset_mtcnn.py, execute the following in a Jupyter notebook:

run src/align/align_dataset_mtcnn.py \
    datasets/lfw/raw \
    datasets/lfw/lfw_mtcnnpy_160 \
    --image_size 160  --margin 32 \
    --random_order

If you run it from an Anaconda Prompt instead, replace run with python.

This command creates the folder datasets/lfw/lfw_mtcnnpy_160 and stores all the aligned face images in it, keeping the same directory structure as the original datasets/lfw/raw. The parameters --image_size 160 --margin 32 mean that the face box detected by MTCNN is expanded by a 32-pixel margin (so the training data includes a bit of context around the face) and then scaled to 160×160, so every aligned image is 160×160 pixels. With that, the faces have been detected and aligned from the original images.
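
As a quick sanity check (this is not part of the repository), you can count how many aligned images were produced per person in the new folder:

import os

aligned_dir = 'datasets/lfw/lfw_mtcnnpy_160'
counts = {name: len(os.listdir(os.path.join(aligned_dir, name)))
          for name in os.listdir(aligned_dir)
          if os.path.isdir(os.path.join(aligned_dir, name))}
print('people: %d, aligned images: %d' % (len(counts), sum(counts.values())))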

"""Performs face alignment and stores face thumbnails in the output directory."""
# MIT License
#
# Copyright (c) 2016 David Sandberg
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
 
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
 
from scipy import misc
import sys
import os
import argparse
import tensorflow as tf
import numpy as np
import facenet
import align.detect_face
import random
from time import sleep
 
# args: parsed command-line arguments
def main(args):
    sleep(random.random())
    # Path where the aligned face images will be stored
    output_dir = os.path.expanduser(args.output_dir)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    # Store some git revision info and the run arguments in a text file in the output directory
    src_path, _ = os.path.split(os.path.realpath(__file__))
    facenet.store_revision_info(src_path, output_dir, ' '.join(sys.argv))
    # Load the dataset: each class (person) name and the absolute paths of all images in that class
    dataset = facenet.get_dataset(args.input_dir)
 
    print('Creating networks and loading parameters')
    # Build the MTCNN networks and initialize them with the pretrained weights
    with tf.Graph().as_default():
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
        sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
        with sess.as_default():
            pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)
 
    minsize = 20  # minimum size of face
    threshold = [0.6, 0.7, 0.7]  # three steps's threshold
    factor = 0.709  # scale factor
 
    # Add a random key to the filename to allow alignment using multiple processes
    random_key = np.random.randint(0, high=99999)
    bounding_boxes_filename = os.path.join(output_dir, 'bounding_boxes_%05d.txt' % random_key)
    # The face bounding box found in each image is written to a record file
    with open(bounding_boxes_filename, "w") as text_file:
        nrof_images_total = 0
        nrof_successfully_aligned = 0
        if args.random_order:
            random.shuffle(dataset)
        # Iterate over every person and the absolute paths of all their images
        for cls in dataset:
            # Output folder for this person
            output_class_dir = os.path.join(output_dir, cls.name)
            if not os.path.exists(output_class_dir):
                os.makedirs(output_class_dir)
                if args.random_order:
                    random.shuffle(cls.image_paths)
            # Iterate over every image of this person
            for image_path in cls.image_paths:
                nrof_images_total += 1
                filename = os.path.splitext(os.path.split(image_path)[1])[0]
                output_filename = os.path.join(output_class_dir, filename + '.png')
                print(image_path)
                if not os.path.exists(output_filename):
                    try:
                        img = misc.imread(image_path)
                    except (IOError, ValueError, IndexError) as e:
                        errorMessage = '{}: {}'.format(image_path, e)
                        print(errorMessage)
                    else:
                        if img.ndim < 2:
                            print('Unable to align "%s"' % image_path)
                            text_file.write('%s\n' % (output_filename))
                            continue
                        if img.ndim == 2:
                            img = facenet.to_rgb(img)
                        img = img[:, :, 0:3]
                        # Face detection: bounding_boxes has shape [n,5], where the 5 values are x1,y1,x2,y2,score;
                        # the landmarks cover five points (left/right eye, nose, left/right mouth corner), an x and a y each, i.e. 10 values per face
                        bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor)
                        # Number of detected faces
                        nrof_faces = bounding_boxes.shape[0]
                        if nrof_faces > 0:
                            # [n,4] face boxes
                            det = bounding_boxes[:, 0:4]
                            img_size = np.asarray(img.shape)[0:2]
                            if nrof_faces > 1:
                                # More than one face in the image: keep the largest, most central one
                                bounding_box_size = (det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1])
                                img_center = img_size / 2
                                offsets = np.vstack([(det[:, 0] + det[:, 2]) / 2 - img_center[1], (det[:, 1] + det[:, 3]) / 2 - img_center[0]])
                                offset_dist_squared = np.sum(np.power(offsets, 2.0), 0)
                                index = np.argmax(bounding_box_size - offset_dist_squared * 2.0)  # some extra weight on the centering
                                det = det[index, :]
                            det = np.squeeze(det)
                            bb = np.zeros(4, dtype=np.int32)
                            # Expand the box by margin/2 on each side (clamped to the image), then crop and resize to a common size
                            bb[0] = np.maximum(det[0] - args.margin / 2, 0)
                            bb[1] = np.maximum(det[1] - args.margin / 2, 0)
                            bb[2] = np.minimum(det[2] + args.margin / 2, img_size[1])
                            bb[3] = np.minimum(det[3] + args.margin / 2, img_size[0])
                            #print(bb)
                            cropped = img[bb[1]:bb[3], bb[0]:bb[2], :]
                            scaled = misc.imresize(cropped, (args.image_size, args.image_size), interp='bilinear')
                            nrof_successfully_aligned += 1
                            misc.imsave(output_filename, scaled)
                            text_file.write('%s %d %d %d %d\n' % (output_filename, bb[0], bb[1], bb[2], bb[3]))
                        else:
                            print('Unable to align "%s"' % image_path)
                            text_file.write('%s\n' % (output_filename))
 
    print('Total number of images: %d' % nrof_images_total)
    print('Number of successfully aligned images: %d' % nrof_successfully_aligned)
 
 
def parse_arguments(argv):
    # Parse command-line arguments
    parser = argparse.ArgumentParser()
    # Positional arguments: input_dir and output_dir
    parser.add_argument('input_dir', type=str, help='Directory with unaligned images.')
    parser.add_argument('output_dir', type=str, help='Directory with aligned face thumbnails.')
    parser.add_argument('--image_size', type=int,
                        help='Image size (height, width) in pixels.', default=182)
    parser.add_argument('--margin', type=int,
                        help='Margin for the crop around the bounding box (height, width) in pixels.', default=44)
    parser.add_argument('--random_order',
                        help='Shuffles the order of images to enable alignment using multiple processes.', action='store_true')
    parser.add_argument('--gpu_memory_fraction', type=float,
                        help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
    # Parse
    return parser.parse_args(argv)
 
if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

align.detect_face.detect_face() performs the actual face detection and returns the calibrated bounding boxes with their scores, together with the landmark coordinates;
the boxes are then post-processed: the face is cropped from the original image (after expanding the box by a 32-pixel margin), scaled to 160×160, and the relevant information is written to file.

def detect_face(img, minsize, pnet, rnet, onet, threshold, factor):
    # im: input image
    # minsize: minimum of faces' size
    # pnet, rnet, onet: caffemodel
    # threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold
    # fastresize: resize img from last scale (using in high-resolution images) if fastresize==true
    factor_count=0
    total_boxes=np.empty((0,9))
    points=[]
    h=img.shape[0]
    w=img.shape[1]
    # Shorter side of the image, e.g. 250 for a 250x250 image
    minl=np.amin([h, w])
    # With a minimum face size of minsize=20 and a 12x12 P-Net detection window,
    # the image must be scaled down so the window can cover a whole face: m = 12/20 = 0.6
    m=12.0/minsize
    minl=minl*m
    # Create the scale pyramid: store the scaling coefficient of each level, 0.6, 0.6*0.709, ...
    scales=[]
    while minl>=12:
        scales += [m*np.power(factor, factor_count)]
        minl = minl*factor
        factor_count += 1
 
    # first stage
    for j in range(len(scales)):
        # Resize the image to this scale
        scale=scales[j]
        hs=int(np.ceil(h*scale))
        ws=int(np.ceil(w*scale))
        # Normalize pixel values to [-1, 1]
        im_data = imresample(img, (hs, ws))
        im_data = (im_data-127.5)*0.0078125
        img_x = np.expand_dims(im_data, 0)
        img_y = np.transpose(img_x, (0,2,1,3))
        out = pnet(img_y)
        out0 = np.transpose(out[0], (0,2,1,3))
        out1 = np.transpose(out[1], (0,2,1,3))
        # Output is [n,9]: the first 4 values are the box position in the original image, the 5th is the face probability, the last 4 are the box-regression offsets
        boxes, _ = generateBoundingBox(out1[0,:,:,1].copy(), out0[0,:,:,:].copy(), scale, threshold[0])
        
        # Inter-scale NMS (non-maximum suppression); keep the surviving boxes
        pick = nms(boxes.copy(), 0.5, 'Union')
        if boxes.size>0 and pick.size>0:
            boxes = boxes[pick,:]
            total_boxes = np.append(total_boxes, boxes, axis=0)
    # Once every scale has been processed we have all boxes from the different scales on the original image;
    # run NMS over them once more, this time with a higher threshold
    numbox = total_boxes.shape[0]
    if numbox>0:
        pick = nms(total_boxes.copy(), 0.7, 'Union')
        total_boxes = total_boxes[pick,:]
        # Calibrate the boxes with the regression output: relative offsets of the top-left x and y, plus width and height corrections
        regw = total_boxes[:,2]-total_boxes[:,0]
        regh = total_boxes[:,3]-total_boxes[:,1]
        qq1 = total_boxes[:,0]+total_boxes[:,5]*regw
        qq2 = total_boxes[:,1]+total_boxes[:,6]*regh
        qq3 = total_boxes[:,2]+total_boxes[:,7]*regw
        qq4 = total_boxes[:,3]+total_boxes[:,8]*regh
        # [n,5]: calibrated x1,y1,x2,y2 plus score
        total_boxes = np.transpose(np.vstack([qq1, qq2, qq3, qq4, total_boxes[:,4]]))
        # Turn every box into a square
        total_boxes = rerec(total_boxes.copy())
        total_boxes[:,0:4] = np.fix(total_boxes[:,0:4]).astype(np.int32)
        # Clip coordinates that fall outside the image, giving the final bounding-box coordinates on the original image
        dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)
 
    numbox = total_boxes.shape[0]
    if numbox>0:
        # Second stage: R-Net. Patches from the P-Net boxes are resized to 24x24
        tempimg = np.zeros((24,24,3,numbox))
        for k in range(0,numbox):
            tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3))
            tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:]
            if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0:
                tempimg[:,:,:,k] = imresample(tmp, (24, 24))
            else:
                return np.empty()
        # Normalize to [-1, 1]
        tempimg = (tempimg-127.5)*0.0078125
        # Transpose to [n,24,24,3]
        tempimg1 = np.transpose(tempimg, (3,1,0,2))
        out = rnet(tempimg1)
        out0 = np.transpose(out[0])
        out1 = np.transpose(out[1])
        score = out1[1,:]
        ipass = np.where(score>threshold[1])
        total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)])
        mv = out0[:,ipass[0]]
        if total_boxes.shape[0]>0:
            pick = nms(total_boxes, 0.7, 'Union')
            total_boxes = total_boxes[pick,:]
            total_boxes = bbreg(total_boxes.copy(), np.transpose(mv[:,pick]))
            total_boxes = rerec(total_boxes.copy())
 
    numbox = total_boxes.shape[0]
    if numbox>0:
        # third stage
        total_boxes = np.fix(total_boxes).astype(np.int32)
        dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)
        tempimg = np.zeros((48,48,3,numbox))
        for k in range(0,numbox):
            tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3))
            tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:]
            if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0:
                tempimg[:,:,:,k] = imresample(tmp, (48, 48))
            else:
                return np.empty()
        tempimg = (tempimg-127.5)*0.0078125
        tempimg1 = np.transpose(tempimg, (3,1,0,2))
        out = onet(tempimg1)
        # Box regression output (used as mv below)
        out0 = np.transpose(out[0])
        # Landmark output (points)
        out1 = np.transpose(out[1])
        # Face probability
        out2 = np.transpose(out[2])
        score = out2[1,:]
        points = out1
        ipass = np.where(score>threshold[2])
        points = points[:,ipass[0]]
        #[n,5]
        total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)])
        mv = out0[:,ipass[0]]
 
        w = total_boxes[:,2]-total_boxes[:,0]+1
        h = total_boxes[:,3]-total_boxes[:,1]+1
        # Map the landmark coordinates back to the original image
        points[0:5,:] = np.tile(w,(5, 1))*points[0:5,:] + np.tile(total_boxes[:,0],(5, 1))-1
        points[5:10,:] = np.tile(h,(5, 1))*points[5:10,:] + np.tile(total_boxes[:,1],(5, 1))-1
        if total_boxes.shape[0]>0:
            total_boxes = bbreg(total_boxes.copy(), np.transpose(mv))
            pick = nms(total_boxes.copy(), 0.7, 'Min')
            total_boxes = total_boxes[pick,:]
            points = points[:,pick]
    # Return the boxes [n,5] (x1,y1,x2,y2,score) and the landmarks (10 values per face)
    return total_boxes, points
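
A minimal usage sketch of the two functions above (assuming the facenet src directory is importable and face.jpg is any test image you supply):

import tensorflow as tf
from scipy import misc
import align.detect_face

with tf.Graph().as_default():
    sess = tf.Session()
    with sess.as_default():
        pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)

img = misc.imread('face.jpg')
boxes, points = align.detect_face.detect_face(img, 20, pnet, rnet, onet, [0.6, 0.7, 0.7], 0.709)
print('%d face(s) found' % boxes.shape[0])
for (x1, y1, x2, y2, score) in boxes:
    print('box (%d, %d, %d, %d), score %.3f' % (x1, y1, x2, y2, score))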

At this point the preparation is basically done: we have the LFW test set, the model and the code, so next we evaluate the model's accuracy.

run src/validate_on_lfw.py \
  datasets/lfw/lfw_mtcnnpy_160 \
  models/facenet/model-20170512-110547/

2.4 Running the face comparison program (compare.py)

Using the existing model on our own data:

run src/compare.py \
    models/facenet/model-20170512-110547/ \
    test_imgs/1.jpg test_imgs/2.jpg test_imgs/3.jpg

facenet can directly compare two faces by the Euclidean distance between their embeddings after the network mapping.
The source of compare.py:

 
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
 
from scipy import misc
import tensorflow as tf
import numpy as np
import sys
import os
import argparse
import facenet
import align.detect_face
 
 
def main(args):
 
    images = load_and_align_data(args.image_files, args.image_size, args.margin, args.gpu_memory_fraction)
    with tf.Graph().as_default():
 
        with tf.Session() as sess:
 
            # Load the model
            facenet.load_model(args.model)
 
            # images_placeholder is the placeholder for the input images; the aligned images are fed to it below
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            # embeddings is the final feature output of the network
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            # phase_train_placeholder tells the network whether it is in the training phase
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
 
            # Compute the embeddings
            feed_dict = {images_placeholder: images, phase_train_placeholder: False}
            emb = sess.run(embeddings, feed_dict=feed_dict)
            #print(emb)  # Uncomment to inspect the computed features; facenet embeddings are 128-dimensional
            # nrof_images is the total number of images
            nrof_images = len(args.image_files)
            # Print the image names
            print('Images:')
            for i in range(nrof_images):
                print('%1d: %s' % (i, args.image_files[i]))
            print('')
 
            # Print the distance matrix
            print('Distance matrix')
            print('    ', end='')
            for i in range(nrof_images):
                print('    %1d     ' % i, end='')
            print('')
            for i in range(nrof_images):
                print('%1d  ' % i, end='')
                for j in range(nrof_images):
                    dist = np.sqrt(np.sum(np.square(np.subtract(emb[i, :], emb[j, :]))))
                    print('  %1.4f  ' % dist, end='')
                print('')
 
 
def load_and_align_data(image_paths, image_size, margin, gpu_memory_fraction):
 
    minsize = 20  # minimum size of face
    threshold = [0.6, 0.7, 0.7]  # three steps's threshold
    factor = 0.709  # scale factor
 
    print('Creating networks and loading parameters')
    with tf.Graph().as_default():
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_memory_fraction)
        sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
        with sess.as_default():
            pnet, rnet, onet = align.detect_face.create_mtcnn(sess, None)
    # nrof_samples is the number of images; image_paths holds their paths
    nrof_samples = len(image_paths)
    # img_list stores the aligned images
    img_list = [None] * nrof_samples
    for i in range(nrof_samples):
        # Read the image
        img = misc.imread(os.path.expanduser(image_paths[i]))
        img_size = np.asarray(img.shape)[0:2]
        # Detect the face with MTCNN (P-Net, R-Net, O-Net); the result goes into bounding_boxes
        bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor)
        # Expand the detected bounding box by margin/2 on each side before cropping (the aligned training data included context around the face)
        det = np.squeeze(bounding_boxes[0, 0:4])
        bb = np.zeros(4, dtype=np.int32)
        bb[0] = np.maximum(det[0] - margin / 2, 0)
        bb[1] = np.maximum(det[1] - margin / 2, 0)
        bb[2] = np.minimum(det[2] + margin / 2, img_size[1])
        bb[3] = np.minimum(det[3] + margin / 2, img_size[0])
        # Crop the face region and resize it to the network input size
        cropped = img[bb[1]:bb[3], bb[0]:bb[2], :]
        aligned = misc.imresize(cropped, (image_size, image_size), interp='bilinear')
        prewhitened = facenet.prewhiten(aligned)
        img_list[i] = prewhitened
    images = np.stack(img_list)
    return images
 
 
def parse_arguments(argv):
    parser = argparse.ArgumentParser()
 
    parser.add_argument('model', type=str,
                        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('image_files', type=str, nargs='+', help='Images to compare')
    parser.add_argument('--image_size', type=int,
                        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--margin', type=int,
                        help='Margin for the crop around the bounding box (height, width) in pixels.', default=44)
    parser.add_argument('--gpu_memory_fraction', type=float,
                        help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
    return parser.parse_args(argv)
 
if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))
  • MTCNN first detects and aligns the original test images, producing an [n,160,160,3] batch;
  • the facenet network is loaded from the model files (meta and ckpt);
  • the processed test images are fed through the network to obtain one embedding per image, and the pairwise distances between embeddings measure how similar the faces are (a threshold sketch follows this list);
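
A hedged sketch of turning those pairwise distances into a same/different decision; the 1.1 threshold is a commonly used starting point for this kind of model, not a value fixed by the repository, so tune it on your own data:

import numpy as np

def is_same_person(emb1, emb2, threshold=1.1):
    dist = np.sqrt(np.sum(np.square(emb1 - emb2)))
    return dist < threshold, dist

emb1, emb2 = np.random.randn(128), np.random.randn(128)   # stand-ins for real embeddings
same, dist = is_same_person(emb1, emb2)
print('distance %.4f -> %s' % (dist, 'same person' if same else 'different people'))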

2.5 Retraining the model

Training a new model from scratch requires a very large dataset; here we use CASIA-WebFace as an example. The dataset can no longer be downloaded from its original address, and it reportedly contains many invalid images, so we use a cleaned-up copy, available on Baidu netdisk (password: 3zbb).

The database contains 10,575 identities and 494,414 images; each identity has its own folder with anywhere from a few to a few dozen face pictures of the same person. We first use MTCNN to crop the faces out of these photos and then hand them to FaceNet below for training.
After downloading, extract it into datasets/casia/raw.

Each folder corresponds to one person and holds all of that person's face images. As with the LFW dataset, we first run MTCNN face detection and alignment on the raw images:

run src/align/align_dataset_mtcnn.py \
    datasets/casia/raw \
    datasets/casia/casia_maxpy_mtcnnpy_182 \
    --image_size 182  --margin 44

The aligned images are stored under datasets/casia/casia_maxpy_mtcnnpy_182, each 182×182 pixels. The final network input is 160×160; generating 182×182 images first leaves room for the random-crop step of data augmentation: a 160×160 region is randomly cropped from each 182×182 image before it is fed to the network for training (see the sketch below).
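
A minimal TF 1.x sketch of that augmentation step (the sizes mirror the values used above; this is an illustration, not the training pipeline itself):

import numpy as np
import tensorflow as tf

with tf.Graph().as_default(), tf.Session() as sess:
    image = tf.placeholder(tf.uint8, shape=(182, 182, 3))
    patch = tf.random_crop(image, [160, 160, 3])            # random 160x160 region
    patch = tf.image.random_flip_left_right(patch)          # optional horizontal flip
    patch = tf.image.per_image_standardization(tf.cast(patch, tf.float32))
    out = sess.run(patch, {image: np.zeros((182, 182, 3), np.uint8)})
    print(out.shape)  # (160, 160, 3)
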
Run one of the following scripts to start training: train_softmax.py or train_tripletloss.py. For example, with train_softmax.py:

run src/train_softmax.py \
    --logs_base_dir logs/facenet/ \
    --models_base_dir models/facenet/ \
    --data_dir datasets/casia/casia_maxpy_mtcnnpy_182 \
    --image_size 160 \
    --model_def models.inception_resnet_v1 \
    --lfw_dir datasets/lfw/lfw_mtcnnpy_160 \
    --optimizer RMSPROP \
    --learning_rate -1 \
    --max_nrof_epochs 80 \
    --keep_probability 0.8 \
    --random_crop --random_flip \
    --learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt \
    --weight_decay 5e-5 \
    --center_loss_factor 1e-2 \
    --center_loss_alfa 0.9

The command above has quite a few parameters, so let's go through them one by one. The script src/train_softmax.py trains the model with a combination of center loss and softmax loss; its parameters are:

  • logs_base_dir logs/facenet/: training logs are written under this directory; at startup a new sub-folder named after the current time is created there and the logs go into it. The "log" here is a TensorFlow events file holding the current loss, step count, learning rate and so on; it can later be inspected with TensorBoard (e.g. tensorboard --logdir logs/facenet);
  • models_base_dir models/facenet/: the trained model is saved under this directory; again a sub-folder named after the current time is created for this run;
  • data_dir datasets/casia/casia_maxpy_mtcnnpy_182: path of the training dataset, here the CASIA-WebFace faces we just aligned;
  • image_size 160: the network input images are 160×160;
  • model_def models.inception_resnet_v1: the convolutional network used for training is inception_resnet_v1. The supported networks live in src/models and include inception_resnet_v1, inception_resnet_v2 and squeezenet; the first two are large, the last one is small. If you run out of memory or GPU memory, try squeezenet or lower batch_size to 32 or 64 (the default is 90);
  • lfw_dir datasets/lfw/lfw_mtcnnpy_160: path of the LFW dataset. When given, the model is evaluated on LFW after every epoch and the accuracy is written to the log file;
  • optimizer RMSPROP: the optimization method;
  • learning_rate -1: the learning rate; a negative value means this parameter is ignored and the schedule given by --learning_rate_schedule_file is used instead;
  • max_nrof_epochs 80: number of training epochs;
  • keep_probability 0.8: dropout is added to the fully connected layer; this is the probability that a connection is kept;
  • random_crop: use random cropping for data augmentation;
  • random_flip: use random horizontal flipping for data augmentation;
  • learning_rate_schedule_file data/learning_rate_schedule_classifier_casia.txt: because --learning_rate -1 was given, the learning rate is taken from this schedule file, whose contents are:
# Learning rate schedule
# Maps an epoch number to a learning rate
0:  0.05
60: 0.005
80: 0.0005
91: -1
  • weight_decay 5e-5: L2 regularization coefficient;
  • center_loss_factor 1e-2: weight that balances the center loss against the softmax loss (a rough sketch of the center-loss idea follows this list);
  • center_loss_alfa 0.9: internal update rate of the center loss;
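
A rough NumPy sketch of the center-loss idea behind the last two parameters (an illustration only, not the facenet.center_loss implementation):

import numpy as np

def center_loss_step(features, labels, centers, alfa=0.9):
    # Loss term: mean squared distance between each embedding and its class center
    loss = np.mean(np.sum(np.square(features - centers[labels]), axis=1))
    # Each used center moves a small step (1 - alfa) towards the batch embeddings of its class
    for c in np.unique(labels):
        centers[c] -= (1 - alfa) * np.mean(centers[c] - features[labels == c], axis=0)
    return loss, centers

centers = np.zeros((10575, 128))                 # one center per CASIA-WebFace identity
feats = np.random.randn(90, 128)                 # one batch of 90 embeddings
labels = np.random.randint(0, 10575, size=90)
loss, centers = center_loss_step(feats, labels, centers)
print('center loss %.4f (weighted by center_loss_factor=1e-2 in the total loss)' % loss)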

Besides the parameters used above there are many more; some of the important ones are:

  • pretrained_model models/20180408-102900: a pretrained model; starting from one speeds up training and is common when fine-tuning;
  • batch_size: the batch size; the larger it is, the more memory is needed;
  • random_rotate: use random rotation for data augmentation;

The commented source of the triplet-loss training script (train_tripletloss.py) follows:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import os.path
import time
import sys
import tensorflow as tf
import numpy as np
import importlib
import itertools
import argparse
import facenet
import lfw

from tensorflow.python.ops import data_flow_ops

from six.moves import xrange  # @UnresolvedImport

def main(args):
    # Import the network architecture module
    network = importlib.import_module(args.model_def)

    # Create new log and model directories for this run
    subdir = datetime.strftime(datetime.now(), '%Y%m%d-%H%M%S')
    log_dir = os.path.join(os.path.expanduser(args.logs_base_dir), subdir)
    if not os.path.isdir(log_dir):  # Create the log directory if it doesn't exist
        os.makedirs(log_dir)

    # Create the folder where the trained model will be saved
    model_dir = os.path.join(os.path.expanduser(args.models_base_dir), subdir)
    if not os.path.isdir(model_dir):  # Create the model directory if it doesn't exist
        os.makedirs(model_dir)

    # Write arguments to a text file
    # Save the input arguments
    facenet.write_arguments_to_file(args, os.path.join(log_dir, 'arguments.txt'))
        
    # Store some git revision info in a text file in the log directory
    # Save git revision info
    src_path,_ = os.path.split(os.path.realpath(__file__))
    facenet.store_revision_info(src_path, log_dir, ' '.join(sys.argv))

    # Load the dataset
    # NB: the data must be laid out in the expected folder structure
    np.random.seed(seed=args.seed)
    # train_set returned by get_dataset is a collection of file paths and labels
    train_set = facenet.get_dataset(args.data_dir)

    # Pre-trained model path
    print('Model directory: %s' % model_dir)
    print('Log directory: %s' % log_dir)
    if args.pretrained_model:
        print('Pre-trained model: %s' % os.path.expanduser(args.pretrained_model))

    # LFW test set
    if args.lfw_dir:
        print('LFW directory: %s' % args.lfw_dir)
        # Read the file containing the pairs used for testing
        pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
        # Get the paths for the corresponding images
        lfw_paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs)

    # Build the graph context
    with tf.Graph().as_default():
        tf.set_random_seed(args.seed)
        global_step = tf.Variable(0, trainable=False)

        # Placeholder for the learning rate
        # Create the various placeholders
        learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate')
        
        batch_size_placeholder = tf.placeholder(tf.int32, name='batch_size')
        
        phase_train_placeholder = tf.placeholder(tf.bool, name='phase_train')

        # Why is the shape (None, 3)? Because the triplet loss needs 3 images per example
        image_paths_placeholder = tf.placeholder(tf.string, shape=(None,3), name='image_paths')
        labels_placeholder = tf.placeholder(tf.int64, shape=(None,3), name='labels')

        # Dataset preparation
        # Create a FIFO queue for the input pipeline
        input_queue = data_flow_ops.FIFOQueue(capacity=100000,
                                    dtypes=[tf.string, tf.int64],
                                    shapes=[(3,), (3,)],
                                    shared_name=None, name=None)
        # Enqueue the file names
        # enqueue_many returns an op
        enqueue_op = input_queue.enqueue_many([image_paths_placeholder, labels_placeholder])

        # Dequeue file names and apply a series of transforms (read image, crop, horizontal flip, reshape);
        # the result is [images, label], where images has shape (3, image_size, image_size, 3) and label has shape (3,)
        nrof_preprocess_threads = 4
        images_and_labels = []
        for _ in range(nrof_preprocess_threads):
            filenames, label = input_queue.dequeue()
            images = []
            for filename in tf.unstack(filenames):
                file_contents = tf.read_file(filename)
                image = tf.image.decode_image(file_contents, channels=3)
                
                if args.random_crop:
                    image = tf.random_crop(image, [args.image_size, args.image_size, 3])
                else:
                    image = tf.image.resize_image_with_crop_or_pad(image, args.image_size, args.image_size)
                if args.random_flip:
                    image = tf.image.random_flip_left_right(image)
    
                #pylint: disable=no-member
                image.set_shape((args.image_size, args.image_size, 3))
                images.append(tf.image.per_image_standardization(image))
            images_and_labels.append([images, label])

        # Because enqueue_many=True,
        # every element of images_and_labels, i.e. [images, label], is split along axis 0 into individual examples;
        # image_batch has shape (batch_size, image_size, image_size, 3)
        # labels_batch has shape (batch_size,)
        image_batch, labels_batch = tf.train.batch_join(
            images_and_labels, batch_size=batch_size_placeholder, 
            shapes=[(args.image_size, args.image_size, 3), ()], enqueue_many=True,
            capacity=4 * nrof_preprocess_threads * args.batch_size,
            allow_smaller_final_batch=True)
        image_batch = tf.identity(image_batch, 'image_batch')
        image_batch = tf.identity(image_batch, 'input')
        labels_batch = tf.identity(labels_batch, 'label_batch')

        # Build the inference graph
        # Map each image to a 128-d vector and apply L2 normalization
        # Build the network graph up to, but not including, the loss layers
        prelogits, _ = network.inference(image_batch, args.keep_probability, 
            phase_train=phase_train_placeholder, bottleneck_layer_size=args.embedding_size,
            weight_decay=args.weight_decay)
        
        embeddings = tf.nn.l2_normalize(prelogits, 1, 1e-10, name='embeddings')
        # Split embeddings into anchor, positive and negative and calculate triplet loss
        # Split the embeddings into anchor/positive/negative and compute the triplet loss
        anchor, positive, negative = tf.unstack(tf.reshape(embeddings, [-1,3,args.embedding_size]), 3, 1)
        triplet_loss = facenet.triplet_loss(anchor, positive, negative, args.alpha)

        # Learning rate
        # Apply exponential decay to the learning rate
        learning_rate = tf.train.exponential_decay(learning_rate_placeholder, global_step,
            args.learning_rate_decay_epochs*args.epoch_size, args.learning_rate_decay_factor, staircase=True)
        # Summary for TensorBoard
        tf.summary.scalar('learning_rate', learning_rate)

        # Calculate the total losses
        # Collect the values registered under the REGULARIZATION_LOSSES collection
        regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)

        # L2 regularization adds the sum of the squared weights to the loss: large weights are penalized
        # heavily, small weights only slightly. Weights are defined with a regularizer argument
        # (an l2_regularizer weighted by a constant), so TensorFlow tracks their L2 terms and adds them
        # to the tf.GraphKeys.REGULARIZATION_LOSSES collection.
        # The sum of all regularization losses is added to the triplet loss to obtain the model's total loss.

        total_loss = tf.add_n([triplet_loss] + regularization_losses, name='total_loss')

        # Build a Graph that trains the model with one batch of examples and updates the model parameters
        train_op = facenet.train(total_loss, global_step, args.optimizer, 
            learning_rate, args.moving_average_decay, tf.global_variables())
        
        # Create a saver
        # Saver/summary bookkeeping
        saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)

        # Build the summary operation based on the TF collection of Summaries.
        summary_op = tf.summary.merge_all()

        # Start running operations on the Graph.
        # Create the session and initialize the variables
        gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=args.gpu_memory_fraction)
        sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))        

        # Initialize variables
        sess.run(tf.global_variables_initializer(), feed_dict={phase_train_placeholder:True})
        sess.run(tf.local_variables_initializer(), feed_dict={phase_train_placeholder:True})

        # Start the input queue runners and create the summary FileWriter
        summary_writer = tf.summary.FileWriter(log_dir, sess.graph)
        coord = tf.train.Coordinator()
        tf.train.start_queue_runners(coord=coord, sess=sess)

        with sess.as_default():
            # Restore the pre-trained model if one was given
            if args.pretrained_model:
                print('Restoring pretrained model: %s' % args.pretrained_model)
                saver.restore(sess, os.path.expanduser(args.pretrained_model))

            # Training and validation loop
            epoch = 0
            # Number of passes over the whole dataset
            while epoch < args.max_nrof_epochs:
                # Read the current global_step; step is the total number of batches processed so far
                step = sess.run(global_step, feed_dict=None)
                # epoch_size is the number of batches per epoch;
                # the epoch index is the global batch count divided by epoch_size, and it is used to look up the learning rate
                epoch = step // args.epoch_size
                # Train for one epoch
                train(args, sess, train_set, epoch, image_paths_placeholder, labels_placeholder, labels_batch,
                    batch_size_placeholder, learning_rate_placeholder, phase_train_placeholder, enqueue_op, input_queue, global_step, 
                    embeddings, total_loss, train_op, summary_op, summary_writer, args.learning_rate_schedule_file,
                    args.embedding_size, anchor, positive, negative, triplet_loss)

                # Save variables and the metagraph if it doesn't exist already
                save_variables_and_metagraph(sess, saver, summary_writer, model_dir, subdir, step)

                # Evaluate on LFW
                if args.lfw_dir:
                    evaluate(sess, lfw_paths, embeddings, labels_batch, image_paths_placeholder, labels_placeholder, 
                            batch_size_placeholder, learning_rate_placeholder, phase_train_placeholder, enqueue_op, actual_issame, args.batch_size, 
                            args.lfw_nrof_folds, log_dir, step, summary_writer, args.embedding_size)

    return model_dir


def train(args, sess, dataset, epoch, image_paths_placeholder, labels_placeholder, labels_batch,
          batch_size_placeholder, learning_rate_placeholder, phase_train_placeholder, enqueue_op, input_queue, global_step, 
          embeddings, loss, train_op, summary_op, summary_writer, learning_rate_schedule_file,
          embedding_size, anchor, positive, negative, triplet_loss):
    batch_number = 0

    # Get the learning rate
    if args.learning_rate>0.0:
        lr = args.learning_rate
    else:
        lr = facenet.get_learning_rate_from_file(learning_rate_schedule_file, epoch)
    # Each epoch trains epoch_size batches
    while batch_number < args.epoch_size:
        # Sample people randomly from the dataset
        # Sample images from the dataset:
        # the total number sampled is args.people_per_batch * args.images_per_person.
        # Two lists are returned: image_paths (the sampled image paths) and
        # num_per_class (how many images were sampled for each class);
        # the number of classes is not necessarily args.people_per_batch.
        image_paths, num_per_class = sample_people(dataset, args.people_per_batch, args.images_per_person)
        
        print('Running forward pass on sampled images: ', end='')
        start_time = time.time()
        # nrof_examples = people_per_batch * images_per_person
        nrof_examples = args.people_per_batch * args.images_per_person
        # The reshape to (-1, 3) matches the placeholder definition in the graph rather than a real triplet structure
        labels_array = np.reshape(np.arange(nrof_examples),(-1,3))
        image_paths_array = np.reshape(np.expand_dims(np.array(image_paths),1), (-1,3))
        # Enqueue the sampled paths
        sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array})
        # Run a forward pass over the sampled images to compute their embeddings;
        # the results are stored in emb_array
        emb_array = np.zeros((nrof_examples, embedding_size))
        nrof_batches = int(np.ceil(nrof_examples / args.batch_size))
        # nrof_batches is the number of forward-pass batches: nrof_examples / batch_size
        for i in range(nrof_batches):
            batch_size = min(nrof_examples-i*args.batch_size, args.batch_size)
            emb, lab = sess.run([embeddings, labels_batch], feed_dict={batch_size_placeholder: batch_size, 
                learning_rate_placeholder: lr, phase_train_placeholder: True})
            emb_array[lab,:] = emb
        print('%.3f' % (time.time()-start_time))

        # Select triplets based on the embeddings
        # Select triplets based on the embeddings;
        # len(triplets) == nrof_triplets, and each element is a tuple of
        # (anchor, positive, negative) image paths
        print('Selecting suitable triplets for training')
        triplets, nrof_random_negs, nrof_triplets = select_triplets(emb_array, num_per_class, 
            image_paths, args.people_per_batch, args.alpha)
        selection_time = time.time() - start_time
        print('(nrof_random_negs, nrof_triplets) = (%d, %d): time=%.3f seconds' % 
            (nrof_random_negs, nrof_triplets, selection_time))

        # Perform training on the selected triplets
        # Train on the triplets just selected
        nrof_batches = int(np.ceil(nrof_triplets*3/args.batch_size))
        triplet_paths = list(itertools.chain(*triplets))
        labels_array = np.reshape(np.arange(len(triplet_paths)),(-1,3))
        triplet_paths_array = np.reshape(np.expand_dims(np.array(triplet_paths),1), (-1,3))
        sess.run(enqueue_op, {image_paths_placeholder: triplet_paths_array, labels_placeholder: labels_array})
        nrof_examples = len(triplet_paths)
        train_time = 0
        i = 0
        emb_array = np.zeros((nrof_examples, embedding_size))
        loss_array = np.zeros((nrof_triplets,))
        summary = tf.Summary()
        step = 0
        while i < nrof_batches:
            start_time = time.time()
            batch_size = min(nrof_examples-i*args.batch_size, args.batch_size)
            feed_dict = {batch_size_placeholder: batch_size, learning_rate_placeholder: lr, phase_train_placeholder: True}
            err, _, step, emb, lab = sess.run([loss, train_op, global_step, embeddings, labels_batch], feed_dict=feed_dict)
            emb_array[lab,:] = emb
            loss_array[i] = err
            duration = time.time() - start_time
            print('Epoch: [%d][%d/%d]\tTime %.3f\tLoss %2.3f' %
                  (epoch, batch_number+1, args.epoch_size, duration, err))
            batch_number += 1
            i += 1
            train_time += duration
            summary.value.add(tag='loss', simple_value=err)
            
        # Add validation loss and accuracy to summary
        #pylint: disable=maybe-no-member
        summary.value.add(tag='time/selection', simple_value=selection_time)
        summary_writer.add_summary(summary, step)
    return step
  
def select_triplets(embeddings, nrof_images_per_class, image_paths, people_per_batch, alpha):
    """ Select the triplets for training
    """
    trip_idx = 0
    # Start index in emb_array of the current person's embeddings
    emb_start_idx = 0
    num_trips = 0
    triplets = []
    
    # VGG Face: Choosing good triplets is crucial and should strike a balance between
    #  selecting informative (i.e. challenging) examples and swamping training with examples that
    #  are too hard. This is achieve by extending each pair (a, p) to a triplet (a, p, n) by sampling
    #  the image n at random, but only between the ones that violate the triplet loss margin. The
    #  latter is a form of hard-negative mining, but it is not as aggressive (and much cheaper) than
    #  choosing the maximally violating example, as often done in structured output learning.

    # Iterate over every person
    for i in xrange(people_per_batch):
        # Number of images this person has
        nrof_images = int(nrof_images_per_class[i])
        # Iterate over the embeddings of person i's images
        for j in xrange(1,nrof_images):
            # a_idx is the position of the j-th image (the anchor) in emb_array
            a_idx = emb_start_idx + j - 1
            # neg_dists_sqr: squared Euclidean distance from this image to every other image
            neg_dists_sqr = np.sum(np.square(embeddings[a_idx] - embeddings), 1)
            for pair in xrange(j, nrof_images): # For every possible positive pair.
                # The positive is taken from the same person
                p_idx = emb_start_idx + pair
                # Squared Euclidean distance between anchor and positive
                pos_dist_sqr = np.sum(np.square(embeddings[a_idx]-embeddings[p_idx]))
                # Images of the same person are never negatives, so mark them as NaN
                neg_dists_sqr[emb_start_idx:emb_start_idx+nrof_images] = np.NaN
                #all_neg = np.where(np.logical_and(neg_dists_sqr-pos_dist_sqr
                # Find images of other people that violate the margin: d(a,n) - d(a,p) < alpha
                all_neg = np.where(neg_dists_sqr-pos_dist_sqr<alpha)[0] # VGG Face selecction
                # nrof_random_negs is the number of candidates satisfying the condition
                nrof_random_negs = all_neg.shape[0]
                # If at least one qualifying negative exists
                if nrof_random_negs>0:
                    # Pick one of them at random as the negative
                    rnd_idx = np.random.randint(nrof_random_negs)
                    n_idx = all_neg[rnd_idx]
                    # Append the anchor, positive and negative image paths as one triplet
                    triplets.append((image_paths[a_idx], image_paths[p_idx], image_paths[n_idx]))
                    #print('Triplet %d: (%d, %d, %d), pos_dist=%2.6f, neg_dist=%2.6f (%d, %d, %d, %d, %d)' % 
                    #    (trip_idx, a_idx, p_idx, n_idx, pos_dist_sqr, neg_dists_sqr[n_idx], nrof_random_negs, rnd_idx, i, j, emb_start_idx))
                    trip_idx += 1

                num_trips += 1

        emb_start_idx += nrof_images

    np.random.shuffle(triplets)
    return triplets, num_trips, len(triplets)

def sample_people(dataset, people_per_batch, images_per_person):
    # nrof_images: total number of images to sample
    # people_per_batch: how many people are sampled per batch
    # images_per_person: how many images are sampled per person
    nrof_images = people_per_batch * images_per_person
  
    # Sample classes from the dataset
    # nrof_classes is the number of people in the dataset
    nrof_classes = len(dataset)
    # class_indices: one index per person
    class_indices = np.arange(nrof_classes)
    # Shuffle the indices in preparation for sampling
    np.random.shuffle(class_indices)
    
    i = 0
    # Paths of the sampled images
    image_paths = []
    # Number of images sampled from each person
    num_per_class = []
    # Which person each sampled image belongs to, used as the label
    sampled_class_indices = []
    # Sample images from these classes until we have enough
    # Keep sampling until the required number of images is reached
    while len(image_paths)<nrof_images:
        # class_index is the label of the i-th person
        class_index = class_indices[i]
        # nrof_images_in_class is this person's total number of images
        nrof_images_in_class = len(dataset[class_index])
        # image_indices: indices of this person's images
        image_indices = np.arange(nrof_images_in_class)
        # Shuffle this person's image indices
        np.random.shuffle(image_indices)
        # nrof_images_from_class: how many images to take from this person
        nrof_images_from_class = min(nrof_images_in_class, images_per_person, nrof_images-len(image_paths))
        idx = image_indices[0:nrof_images_from_class]
        # Paths of the images sampled from this person
        image_paths_for_class = [dataset[class_index].image_paths[j] for j in idx]
        # Add the labels of the sampled images to sampled_class_indices
        sampled_class_indices += [class_index]*nrof_images_from_class
        # Add the sampled image paths to image_paths
        image_paths += image_paths_for_class
        # Record in num_per_class how many images were taken from this person
        num_per_class.append(nrof_images_from_class)
        i+=1
    # Return the sampled image paths and the per-person image counts
    return image_paths, num_per_class

def evaluate(sess, image_paths, embeddings, labels_batch, image_paths_placeholder, labels_placeholder, 
        batch_size_placeholder, learning_rate_placeholder, phase_train_placeholder, enqueue_op, actual_issame, batch_size, 
        nrof_folds, log_dir, step, summary_writer, embedding_size):
    start_time = time.time()
    # Run forward pass to calculate embeddings
    print('Running forward pass on LFW images: ', end='')
    
    nrof_images = len(actual_issame)*2
    assert(len(image_paths)==nrof_images)
    labels_array = np.reshape(np.arange(nrof_images),(-1,3))
    image_paths_array = np.reshape(np.expand_dims(np.array(image_paths),1), (-1,3))
    sess.run(enqueue_op, {image_paths_placeholder: image_paths_array, labels_placeholder: labels_array})
    emb_array = np.zeros((nrof_images, embedding_size))
    nrof_batches = int(np.ceil(nrof_images / batch_size))
    label_check_array = np.zeros((nrof_images,))
    for i in xrange(nrof_batches):
        batch_size = min(nrof_images-i*batch_size, batch_size)
        emb, lab = sess.run([embeddings, labels_batch], feed_dict={batch_size_placeholder: batch_size,
            learning_rate_placeholder: 0.0, phase_train_placeholder: False})
        emb_array[lab,:] = emb
        label_check_array[lab] = 1
    print('%.3f' % (time.time()-start_time))
    
    assert(np.all(label_check_array==1))
    
    _, _, accuracy, val, val_std, far = lfw.evaluate(emb_array, actual_issame, nrof_folds=nrof_folds)
    
    print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
    print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
    lfw_time = time.time() - start_time
    # Add validation loss and accuracy to summary
    summary = tf.Summary()
    #pylint: disable=maybe-no-member
    summary.value.add(tag='lfw/accuracy', simple_value=np.mean(accuracy))
    summary.value.add(tag='lfw/val_rate', simple_value=val)
    summary.value.add(tag='time/lfw', simple_value=lfw_time)
    summary_writer.add_summary(summary, step)
    with open(os.path.join(log_dir,'lfw_result.txt'),'at') as f:
        f.write('%d\t%.5f\t%.5f\n' % (step, np.mean(accuracy), val))

def save_variables_and_metagraph(sess, saver, summary_writer, model_dir, model_name, step):
    # Save the model checkpoint
    print('Saving variables')
    start_time = time.time()
    checkpoint_path = os.path.join(model_dir, 'model-%s.ckpt' % model_name)
    saver.save(sess, checkpoint_path, global_step=step, write_meta_graph=False)
    save_time_variables = time.time() - start_time
    print('Variables saved in %.2f seconds' % save_time_variables)
    metagraph_filename = os.path.join(model_dir, 'model-%s.meta' % model_name)
    save_time_metagraph = 0  
    if not os.path.exists(metagraph_filename):
        print('Saving metagraph')
        start_time = time.time()
        saver.export_meta_graph(metagraph_filename)
        save_time_metagraph = time.time() - start_time
        print('Metagraph saved in %.2f seconds' % save_time_metagraph)
    summary = tf.Summary()
    #pylint: disable=maybe-no-member
    summary.value.add(tag='time/save_variables', simple_value=save_time_variables)
    summary.value.add(tag='time/save_metagraph', simple_value=save_time_metagraph)
    summary_writer.add_summary(summary, step)
  
  
def get_learning_rate_from_file(filename, epoch):
    with open(filename, 'r') as f:
        for line in f.readlines():
            line = line.split('#', 1)[0]
            if line:
                par = line.strip().split(':')
                e = int(par[0])
                lr = float(par[1])
                if e <= epoch:
                    learning_rate = lr
                else:
                    return learning_rate
    

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    
    parser.add_argument('--logs_base_dir', type=str, 
        help='Directory where to write event logs.', default='~/logs/facenet')
    parser.add_argument('--models_base_dir', type=str,
        help='Directory where to write trained models and checkpoints.', default='~/models/facenet')
    parser.add_argument('--gpu_memory_fraction', type=float,
        help='Upper bound on the amount of GPU memory that will be used by the process.', default=1.0)
    parser.add_argument('--pretrained_model', type=str,
        help='Load a pretrained model before training starts.')
    parser.add_argument('--data_dir', type=str,
        help='Path to the data directory containing aligned face patches.',
        default='~/datasets/casia/casia_maxpy_mtcnnalign_182_160')
    parser.add_argument('--model_def', type=str,
        help='Model definition. Points to a module containing the definition of the inference graph.', default='models.inception_resnet_v1')
    parser.add_argument('--max_nrof_epochs', type=int,
        help='Number of epochs to run.', default=500)
    parser.add_argument('--batch_size', type=int,
        help='Number of images to process in a batch.', default=90)
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--people_per_batch', type=int,
        help='Number of people per batch.', default=45)
    parser.add_argument('--images_per_person', type=int,
        help='Number of images per person.', default=40)
    parser.add_argument('--epoch_size', type=int,
        help='Number of batches per epoch.', default=1000)
    parser.add_argument('--alpha', type=float,
        help='Positive to negative triplet distance margin.', default=0.2)
    parser.add_argument('--embedding_size', type=int,
        help='Dimensionality of the embedding.', default=128)
    parser.add_argument('--random_crop', 
        help='Performs random cropping of training images. If false, the center image_size pixels from the training images are used. ' +
         'If the size of the images in the data directory is equal to image_size no cropping is performed', action='store_true')
    parser.add_argument('--random_flip', 
        help='Performs random horizontal flipping of training images.', action='store_true')
    parser.add_argument('--keep_probability', type=float,
        help='Keep probability of dropout for the fully connected layer(s).', default=1.0)
    parser.add_argument('--weight_decay', type=float,
        help='L2 weight regularization.', default=0.0)
    parser.add_argument('--optimizer', type=str, choices=['ADAGRAD', 'ADADELTA', 'ADAM', 'RMSPROP', 'MOM'],
        help='The optimization algorithm to use', default='ADAGRAD')
    parser.add_argument('--learning_rate', type=float,
        help='Initial learning rate. If set to a negative value a learning rate ' +
        'schedule can be specified in the file "learning_rate_schedule.txt"', default=0.1)
    parser.add_argument('--learning_rate_decay_epochs', type=int,
        help='Number of epochs between learning rate decay.', default=100)
    parser.add_argument('--learning_rate_decay_factor', type=float,
        help='Learning rate decay factor.', default=1.0)
    parser.add_argument('--moving_average_decay', type=float,
        help='Exponential decay for tracking of training parameters.', default=0.9999)
    parser.add_argument('--seed', type=int,
        help='Random seed.', default=666)
    parser.add_argument('--learning_rate_schedule_file', type=str,
        help='File containing the learning rate schedule that is used when learning_rate is set to to -1.', default='data/learning_rate_schedule.txt')

    # Parameters for validation on LFW
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_dir', type=str,
        help='Path to the data directory containing aligned face patches.', default='')
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)
  

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

References:
1. MTCNN face detection and alignment with FaceNet face recognition
2. Face detection with MTCNN and face recognition with FaceNet (with source code): https://www.cnblogs.com/zyly/p/9703614.html
3. The official wiki: an overview of the provided features and the corresponding steps
