Virtual Assistant Project
Security is one of the biggest concerns of the modern day. Making sure that only the right person is granted access to a device is extremely important, which is one of the main reasons our smartphones more often than not have a two-step security system. This ensures that our privacy is maintained and that only the authentic owner can access the device. Face recognition based smart face lock technology is one such security measure, and we will look at how to build our very own face recognition system from scratch, with high accuracy, using deep learning and transfer learning with VGG-16.
Note: This is part 1 of the virtual assistant series. There will be more upcoming parts on the same topic, where we will cover how you can build your very own virtual assistant using deep learning technologies and Python.
Introduction:
This section covers exactly what the model we build will do. The face recognition model will be able to detect the authorized owner’s face and will reject any other face. The model will give a vocal response indicating whether access is granted or denied. The user will have three tries to verify their face; on failure of the third attempt, the entire system will shut down, thus maintaining security. If the correct face is recognized, access is granted and the user can proceed to control the device. The entire code will be provided at the end of the article with a link to the GitHub repository.
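To make the intended behavior concrete, here is a minimal sketch of that access-control flow. It assumes a hypothetical verify_face() helper (which would run the trained model on a webcam frame) and the pyttsx3 text-to-speech library for the vocal response; the actual implementation is covered in the rest of the article and in the GitHub repository.

# Minimal sketch of the access-control flow (not the final implementation).
# verify_face() is a hypothetical helper that would run the trained model on a
# webcam frame and return True only for the authorized owner's face.
import sys
import pyttsx3  # offline text-to-speech for the vocal response

engine = pyttsx3.init()

def speak(message):
    engine.say(message)
    engine.runAndWait()

MAX_ATTEMPTS = 3
for attempt in range(1, MAX_ATTEMPTS + 1):
    if verify_face():                     # hypothetical model-based face check
        speak("Access granted.")
        break
    speak(f"Access denied. Attempt {attempt} of {MAX_ATTEMPTS}.")
else:
    speak("Verification failed. Shutting down.")
    sys.exit()                            # shut the system down after the third failure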
Approach:
First, we will look at how to collect images of the legal owner, for whom the face recognition model will grant permission. Then, we can create an additional folder if we want to add more people who should have access to our system. Our next step will be to resize the images into a (224, 224, 3) shape so that we can pass them through the VGG-16 architecture; note that VGG-16 is pre-trained on the ImageNet weights with exactly this input shape. Then we will create variations of the images by performing image data augmentation on the dataset. After this, we are free to create our custom model on top of the VGG-16 architecture by excluding the top layer. This is followed by compiling, training, and fitting the model with the essential callbacks. Finally, we will conclude with a final model that can load the saved weights and perform the face recognition based smart face lock.
Note: In this article, I have performed the entire task with only a train directory. You are free to split the data into train and validation directories. I have also performed the same with train and validation directories, and I will include a plot of that graph at the end.
Collection of Images:
In this step, we will write a simple Python script to collect images with a click of the space bar. We can press the ‘q’ key to exit the graphical window. The collection of images is an important step, and we will collect images only of the people you want to grant access to the device. Let us look at the code we will use to perform these actions.
import cv2
import os

# Start the default webcam and point to the folder where the owner's images will be stored.
capture = cv2.VideoCapture(0)
directory = "Bharath/"
path = os.listdir(directory)
count = 0  # used to label the captured images 0, 1, 2, ...
We are importing the required libraries and initializing the variables accordingly in the above code block.
- Importing the opencv module for computer vision and capturing images.
- Importing the os module to access the local file system.
We turn on the default webcam and then proceed to capture the images of our face that are required for the dataset. This is done with the VideoCapture command. We then create a path to our specific directory and initialize count to 0. This count variable will be used to label our images, from 0 up to the total number of photos we capture.
Note: I had already created my folder. If you want to create the directory directly through the program, you can use the os.mkdir command (as shown below) or create a folder in the usual manner. It is also important to create another folder containing images of other people, who are not to be granted access.
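For completeness, here is a tiny sketch of creating the folder from the script itself; os.makedirs with exist_ok=True also works when the folder already exists.

import os

directory = "Bharath/"
# Create the owner's image folder if it does not exist yet.
os.makedirs(directory, exist_ok=True)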
Finally, we can look at the code required to perform the entire image collection process.
while True:
    ret, frame = capture.read()
    cv2.imshow('Frame', frame)
    key = cv2.waitKey(1)
    if key % 256 == 32:      # space bar: save the current frame
        img_path = directory + str(count) + ".jpeg"
        cv2.imwrite(img_path, frame)
        count += 1
    elif key % 256 == 113:   # 'q': quit the capture loop
        break

capture.release()
cv2.destroyAllWindows()
We make sure the loop runs only while the webcam is captured and active. We capture the video and read back each frame, then assign the variable “key” to whichever button is pressed. This key press gives us two options, and we will refer to the ASCII table for the corresponding codes —
Click here to refer to the ASCII table for the particular wait keys (you can also print the codes yourself, as shown after the list below). Let us now look at our two options for the image capture function.
- Click a picture when we press the space bar on the keyboard.
- Quit the program when ‘q’ is pressed.
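If you do not want to look the codes up, you can print them directly; the space bar maps to 32 and ‘q’ maps to 113 in the ASCII table.

# The key codes used in the capture loop come straight from the ASCII table.
print(ord(' '))   # 32  -> capture an image
print(ord('q'))   # 113 -> quit the program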
After we exit the program we will release the video capture from our webcam and destroy the cv2 graphical window.
Resizing the Images:
In the next code block, we will resize the images accordingly. We want to reshape the collected images into a size suitable for passing through the VGG-16 architecture, which is pre-trained on the ImageNet weights. Let us look at the code for performing this task.
import cv2
import os

directory = "Bharath/"
path = os.listdir(directory)

# Resize every captured image in place to the (224, 224) input size expected by VGG-16.
for i in path:
    img_path = directory + i
    image = cv2.imread(img_path)
    image = cv2.resize(image, (224, 224))
    cv2.imwrite(img_path, image)
- Importing the opencv module for computer vision and capturing images.
- Importing the os module to access the local file system.
We rescale all the captured images from the default frame size to (224, 224) pixels, because that is the input size expected if we want to try transfer learning models like VGG-16. We have already captured the images in RGB format, so we already have 3 channels and do not need to specify them. VGG-16 requires 3 channels, and the architecture ideally takes inputs of shape (224, 224, 3).
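As an optional sanity check, you can read one of the resized files back and confirm the shape; the filename below is just an example from the capture step.

import cv2

sample = cv2.imread("Bharath/0.jpeg")  # example file saved during image collection
print(sample.shape)                    # expected: (224, 224, 3)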
After the resizing step is completed, we can transfer the Owner’s directory into the images folder.
Note: If you are trying to create separate train and validation datasets, then split the images in an 80:20 ratio and place them in the respective directories, as sketched below.
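If you go the train/validation route, a simple sketch of an 80:20 split could look like the following; the train/ and validation/ folder names here are just an example layout, not something fixed by the project.

import os
import random
import shutil

source = "Bharath/"
train_dir = "train/Bharath/"        # example layout, adjust to your directories
val_dir = "validation/Bharath/"

os.makedirs(train_dir, exist_ok=True)
os.makedirs(val_dir, exist_ok=True)

files = os.listdir(source)
random.shuffle(files)
split = int(0.8 * len(files))       # 80:20 split

for name in files[:split]:
    shutil.copy(os.path.join(source, name), train_dir)
for name in files[split:]:
    shutil.copy(os.path.join(source, name), val_dir)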
Image Data Augmentation:
We have collected and reshaped our images. The next step is to perform image data augmentation on the dataset to generate transformed variations of the images and effectively increase the size of the dataset. We can do this with the following code block.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Img_Height, Img_width, and batch_size are assumed to be defined earlier,
# e.g. Img_Height = Img_width = 224 and batch_size = 32.
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=30,
                                   shear_range=0.3,
                                   zoom_range=0.3,
                                   width_shift_range=0.4,
                                   height_shift_range=0.4,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

train_generator = train_datagen.flow_from_directory(directory,
                                                    target_size=(Img_Height, Img_width),
                                                    batch_size=batch_size,
                                                    class_mode='categorical',
                                                    shuffle=True)
The ImageDataGenerator is used for data augmentation of images. It produces transformed copies of the original images on the fly, and the Keras data generator feeds these copies, not the originals, to the model. This provides fresh variations for training at each epoch.
We will rescale the images and set the parameters to suit our model. The parameters are as follows:
1. rescale = Rescale by 1./255 to normalize each of the pixel values.
2. rotation_range = Specifies the random range of rotation.
3. shear_range = Specifies the shear intensity (shear angle in the counter-clockwise direction).
4. zoom_range = Specifies the random zoom range.
5. width_shift_range = Specifies the range of random horizontal shifts, as a fraction of the total width.
6. height_shift_range = Specifies the range of random vertical shifts, as a fraction of the total height.
7. horizontal_flip = Randomly flip the images horizontally.
8. fill_mode = Fill newly created pixels according to the nearest boundaries.
train_datagen.flow_from_directory takes the path to a directory and generates batches of augmented data. The callable properties are as follows:
1. directory = Specifies the directory where we have stored the image data.
2. color_mode = Specifies how the images are loaded, i.e. grayscale or RGB format; the default is RGB.
3. target_size = The dimensions to which the images are resized.
4. batch_size = The number of images in each batch of data for the flow operation.
5. class_mode = Determines the type of label arrays that are returned; “categorical” gives 2D one-hot encoded labels.
6. shuffle = Whether to shuffle the data (default: True); if set to False, the data is sorted in alphanumeric order.
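Before building the model, it can be worth pulling one batch from the generator to confirm the label mapping and the tensor shapes. This is just an optional inspection step; the folder names in the printed mapping depend on your own directory layout.

print(train_generator.class_indices)   # e.g. {'Bharath': 0, 'Others': 1}, depending on your folders
images, labels = next(train_generator)
print(images.shape)                    # (batch_size, 224, 224, 3)
print(labels.shape)                    # (batch_size, number of classes)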
Building The Model:
In the next code block, we import the VGG-16 model into the variable VGG16_MODEL and make sure we load the model without its top layer. Using the VGG-16 architecture without the top layer, we can then add our custom layers. To avoid training the VGG-16 layers, we give the command below: layers.trainable = False.
We will also print out these layers to confirm that trainable is set to False for all of them.
from tensorflow.keras.applications import VGG16

VGG16_MODEL = VGG16(input_shape=(Img_width, Img_Height, 3), include_top=False, weights='imagenet')

# Freeze every layer of the VGG-16 backbone so that only our custom layers are trained.
for layers in VGG16_MODEL.layers:
    layers.trainable = False

for layers in VGG16_MODEL.layers:
    print(layers.trainable)
Let us now proceed to build our custom model on top of the VGG-16 architecture. This is similar to the finger gesture model in one of my other articles, which you can check out here.
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
from tensorflow.keras.models import Model

# num_classes is assumed to be defined earlier (2 in this project:
# the authorized owner and everyone else).

# Input layer
input_layer = VGG16_MODEL.output

# Convolutional Layer
Conv1 = Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1), padding='valid',
               data_format='channels_last', activation='relu',
               kernel_initializer=keras.initializers.he_normal(seed=0),
               name='Conv1')(input_layer)

# MaxPool Layer
Pool1 = MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding='valid',
                  data_format='channels_last', name='Pool1')(Conv1)

# Flatten
flatten = Flatten(data_format='channels_last', name='Flatten')(Pool1)

# Fully Connected layer-1
FC1 = Dense(units=30, activation='relu',
            kernel_initializer=keras.initializers.glorot_normal(seed=32),
            name='FC1')(flatten)

# Fully Connected layer-2
FC2 = Dense(units=30, activation='relu',
            kernel_initializer=keras.initializers.glorot_normal(seed=33),
            name='FC2')(FC1)

# Output layer
Out = Dense(units=num_classes, activation='softmax',
            kernel_initializer=keras.initializers.glorot_normal(seed=3),
            name='Output')(FC2)

model1 = Model(inputs=VGG16_MODEL.input, outputs=Out)
The face recognition model we are building will be trained using transfer learning. We use the VGG-16 model with no top layer and add our custom layers on top of it; we then use this transfer learning model to predict whether a face belongs to the authorized owner or not. The custom head starts with an input layer, which is essentially the output of the VGG-16 model. We add a convolutional layer with 32 filters, a kernel_size of (3, 3), and the default strides of (1, 1), using relu activation with he_normal as the initializer. We use a pooling layer to downsample the output of the convolutional layer. After the sample passes through a Flatten layer, two fully connected (Dense) layers with relu activation follow. The output layer has a softmax activation with num_classes set to 2, predicting the probabilities for the classes, namely the authorized owner (or an additional participant) versus a rejected face. The final model takes the input of the VGG-16 model as its input and the final output layer as its output.
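A quick way to confirm that only the custom head will be trained is to print the model summary and the trainable parameter count, which should be much smaller than the total parameter count.

from tensorflow.keras import backend as K

model1.summary()  # the VGG-16 backbone is frozen; only the custom head is trainable
trainable_params = sum(K.count_params(w) for w in model1.trainable_weights)
print("Trainable parameters:", trainable_params)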
Callbacks:
In the next code block, we will be looking at the required callbacks for our face recognition task.
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.callbacks import TensorBoard

checkpoint = ModelCheckpoint("face_rec.h5", monitor='accuracy', verbose=1,
                             save_best_only=True, mode='auto', period=1)
reduce = ReduceLROnPlateau(monitor='accuracy', factor=0.2, patience=3, min_lr=0.00001, verbose=1)

logdir = 'logsface'
tensorboard_Visualization = TensorBoard(log_dir=logdir, histogram_freq=True)
We will be importing the 3 required callbacks for training our model. The 3 important callbacks are ModelCheckpoint, ReduceLROnPlateau, and TensorBoard. Let us look at what task each of these individual callbacks performs.
ModelCheckpoint — This callback is used for storing the weights of our model after training. We save only the best weights of our model by specifying save_best_only=True. We will monitor our training by using the accuracy metric.
ReduceLROnPlateau — This callback is used for reducing the learning rate of the optimizer when the monitored metric stops improving. Here, we have specified the patience as 3: if the accuracy does not improve after 3 epochs, the learning rate is reduced by a factor of 0.2, down to a minimum of 0.00001. The metric used for monitoring here is accuracy as well.
TensorBoard — The TensorBoard callback is used for visualizing the training graphs, namely the plots of accuracy and loss.
Compile and fit the model:
from tensorflow.keras.optimizers import Adam

model1.compile(loss='categorical_crossentropy',
               optimizer=Adam(lr=0.001),
               metrics=['accuracy'])

epochs = 20

model1.fit(train_generator,
           epochs=epochs,
           callbacks=[checkpoint, reduce, tensorboard_Visualization])
We compile and fit our model in the final step. Here, we train the model and save the best weights to face_rec.h5, so that we do not have to re-train the model repeatedly and can load the saved model whenever required. Here I have trained only on the training data; however, you can choose to train with both train and validation data. The loss we use is categorical_crossentropy, which computes the cross-entropy loss between the labels and the predictions. The optimizer is Adam with a learning rate of 0.001, and we compile the model with accuracy as the metric. We fit the model on the augmented training images. After the fitting step, these are the results we are able to achieve on training loss and accuracy.
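Since the best weights are checkpointed to face_rec.h5, they can later be loaded back without re-training; a minimal sketch:

from tensorflow.keras.models import load_model

# Load the best model saved by the ModelCheckpoint callback.
model = load_model("face_rec.h5")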
Graphs:
Graph of train data —
Graph of train and validation data —
Observation:
We are able to develop a transfer learning based VGG-16 face recognition architecture that gives us very high accuracy and a very low loss. Overall, the model we have developed is highly effective and works well. From the graphs, we can observe that both the train-only run and the run with train and validation plots perform well.
This concludes our face recognition model. If you want to run and test the model for yourself, the code is provided in the GitHub repository. You can check out the link to the entire code, as well as the code for loading the weights and the final run model, here. It requires an additional haarcascade_frontalface_default.xml file, which is used for detecting faces; this too is provided in the GitHub link.
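The final run script itself lives in the repository, but a condensed sketch of how the pieces fit together might look like the code below. It assumes the saved face_rec.h5 model, the haarcascade_frontalface_default.xml file in the working directory, and pyttsx3 for the vocal response; the owner’s class index (0 below) depends on how flow_from_directory ordered your folders, so treat the details as assumptions rather than the exact repository code.

import cv2
import numpy as np
import pyttsx3
from tensorflow.keras.models import load_model

engine = pyttsx3.init()
model = load_model("face_rec.h5")
detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

OWNER_CLASS = 0        # assumption: depends on flow_from_directory's folder order
MAX_ATTEMPTS = 3

capture = cv2.VideoCapture(0)
granted = False
for attempt in range(MAX_ATTEMPTS):
    ret, frame = capture.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))
        face = np.expand_dims(face / 255.0, axis=0)   # match the training-time rescaling
        if np.argmax(model.predict(face)) == OWNER_CLASS:
            granted = True
    if granted:
        engine.say("Access granted.")
        engine.runAndWait()
        break
    engine.say("Access denied.")
    engine.runAndWait()
if not granted:
    engine.say("Verification failed. Shutting down.")
    engine.runAndWait()
capture.release()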
I hope all of you enjoyed reading this article. This is part 1 of the virtual assistant project. Stay tuned for the upcoming parts of the virtual assistant project, and I wish all of you a wonderful day!
Translated from: https://towardsdatascience.com/smart-face-lock-system-6c5a77aa5d30