Everything You Need to Know to Master Convolutional Neural Networks

by Tirmidzi Faizal Aflahi

Look at the photo below:

That is not a real photo. You can open the image in a new tab and zoom in. Do you see the mosaics?

The picture was actually generated by a program built with Artificial Intelligence. Doesn't it look realistic? It's great, isn't it?

It has been only 7 years since the technology was brought to the public by Alex Krizhevsky and friends via the ImageNet competition. This competition is an annual Computer Vision contest that categorizes pictures into 1,000 different classes, from Alaskan Malamutes to toilet paper. Alex and friends built something called AlexNet, and it won the competition by a large margin over second place.

This technology is called a Convolutional Neural Network. It's a sub-branch of Deep Neural Networks that performs exceptionally well at processing images.

The image above shows the error rate produced by the software that won the competition in each of the past several years. By 2016, it was actually better than human performance, which sits at around 5% error.

The introduction of Deep Learning into this field was game-breaking more than game-changing.

Convolutional Neural Network Architecture

So, how does this technology work?

Convolutional Neural Networks perform better than other Deep Neural Network architectures because of their unique process. Instead of looking at an image one pixel at a time, a CNN groups several pixels together (for example, a 3×3 patch of pixels, as in the image above) so that it can understand a spatial pattern.

Put another way, a CNN can "see" a group of pixels forming a line or a curve. Because of the deep nature of Deep Neural Networks, at the next level it sees not groups of pixels but groups of lines and curves forming shapes. And so on, until they form a complete picture.
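
To make the pixel-grouping concrete, here is a minimal sketch (my own illustration, not from the original article) of a single 3×3 convolution applied to a toy grayscale image with Pytorch; the kernel values are arbitrary edge-detector-style numbers chosen just for the example.

import torch
import torch.nn.functional as F

# A toy 8x8 grayscale "image": batch of 1, 1 channel, 8x8 pixels.
image = torch.rand(1, 1, 8, 8)

# A hypothetical 3x3 kernel that responds to vertical edges.
kernel = torch.tensor([[[[-1., 0., 1.],
                         [-2., 0., 2.],
                         [-1., 0., 1.]]]])

# Slide the kernel over the image; each output value summarizes one 3x3 patch.
feature_map = F.conv2d(image, kernel, padding=1)
print(feature_map.shape)  # torch.Size([1, 1, 8, 8])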

There are many things you need to learn if you want to understand CNNs from the ground up, starting with the basics: kernels, pooling layers, and so on. But nowadays, you can just dive in and use one of the many open-source projects built around this technology.

This is actually possible because of a technique called Transfer Learning.

Transfer Learning

Transfer Learning is a technique that reuses a trained Deep Learning model for another, more specific task.

As an example, say you work at a train management company and want to assess whether your trains are on time or not, and you don't want to hire additional staff just for this task.

You can just reuse an ImageNet Convolutional Neural Network model, maybe ResNet (the 2015 winner), and re-train the network on images of your own train fleet. And you will do just fine.
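
As a rough sketch of what that re-training setup looks like in code (my own illustration using torchvision's pre-trained ResNet; the two-class "on time" vs. "delayed" setup is a hypothetical example, not something from the article):

import torch.nn as nn
from torchvision import models

# Load a ResNet pre-trained on ImageNet.
model = models.resnet50(pretrained=True)

# Freeze the convolutional layers so the ImageNet features are kept as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classifier with one for our hypothetical two classes
# ("on time" vs. "delayed"); only this new layer is trained at first.
model.fc = nn.Linear(model.fc.in_features, 2)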

There are two main competitive edges when you use Transfer Learning.

  1. Needs fewer images to perform well than training from scratch. The ImageNet competition provides around 1 million images to train on. Using transfer learning, you can use only 1,000 or even 100 images and still perform well, because the network has already been trained on those 1 million images.

  2. Needs less time to achieve good performance. To be as good as the ImageNet winners, you would need to train the network for days, and that doesn't count the time needed to alter the network when it doesn't work well. Using transfer learning, you only need several hours or even minutes of training for some tasks. A lot of time saved.

Image Classification to Image Generation

Enabled by transfer learning, many initiatives appeared. If a program can process some images and tell us what they are all about, how about constructing the images themselves?

Challenge accepted!

Enter the Generative Adversarial Network.

This technology can generate pictures using some inputs.

A variant called CycleGAN can generate a realistic photo from a painting, as shown in the photo above. In another use case, it can generate a picture of a bag from a sketch. It can even generate a higher-resolution photo from a low-resolution one.

Amazing, aren’t they?

Of course. And you can start learning to build them now. But how?

Convolutional Neural Network Tutorial

So, let’s begin. You will learn that getting started on this topic is easy, so freaking easy. But mastering it is on another level.

Let’s put aside mastering it for now.

After browsing for several days, I found this project, which is really suitable for you to start with.

Aerial Cactus Identification

This is a tutorial project from Kaggle. Your task is to identify whether there is any columnar cactus in an aerial image.

Pretty simple, eh?

You will be given 17,500 images to work with and need to label 4,000 images that have not been labeled. Your score is 1, or 100%, if your program labels all 4,000 images correctly.

The images are pretty much like what you see above: a photo of a region that may or may not contain a group of columnar cacti. The photos are 32×32 pixels, and they show cacti in different orientations since they are aerial photos.

So what do you need?

Convolutional Neural Network with Python

Yes, Python, the popular language for Deep Learning. With many libraries available, you can practically trial-and-error each choice. The choices are:

  1. Tensorflow, the most popular Deep Learning library. Built by engineers at Google, it has the biggest contributor base and the most fans. Because the community is so big, you can easily find solutions to your problems. It has Keras as a high-level abstraction wrapper, which is very friendly for newbies.

  2. Pytorch, my favorite Deep Learning library. It follows Python's design closely, with its pros and cons, so Python developers will feel right at home with it. It has a companion library called FastAI, which gives Pytorch the kind of abstraction that Keras gives Tensorflow.

  3. MXNet. The Deep Learning library by Apache.

  4. Theano, a predecessor of Tensorflow.

  5. CNTK, Microsoft's own Deep Learning library.

For this tutorial, let’s use my favorite one, Pytorch, complemented by its abstraction, FastAI.

Before starting, you need to install Python. Go to the Python website and download what you need. Make sure you install version 3.6+, or it may not be supported by the libraries you will use.

Now, open your command line or terminal and install these packages:

pip install numpy
pip install pandas
pip install jupyter

NumPy will be used to store the input images, and pandas to work with CSV files. Jupyter Notebook is what you need to code interactively with Python.

Then, go to the Pytorch website and download what you need. You might want the CUDA build to speed up your training. Either way, make sure you have Pytorch version 1.0+.
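
A quick way to confirm the install worked and whether a GPU is visible (just a standard Pytorch check, added here for convenience):

import torch

print(torch.__version__)          # should be 1.0 or newer
print(torch.cuda.is_available())  # True if the CUDA build can see your GPU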

After that, install torchvision and FastAI:

pip install torchvision
pip install fastai

Run Jupyter with the command jupyter notebook and it will open a browser window.

Now, you are ready to go.

Prepare the Data

Import the necessary packages:

import numpy as np
import pandas as pd
from pathlib import Path
from fastai import *
from fastai.vision import *
import torch
%matplotlib inline

NumPy and pandas are needed for almost everything you want to do. FastAI and Torch are your Deep Learning libraries. %matplotlib inline is used to show charts inside the notebook.

Now, download data files from the competition website.

Extract the zip data files and put them inside your Jupyter notebook folder.

Let’s say you named your notebook Cacti. Your folder structure would be like this:

The train folder contains all the images for the training step.

The test folder contains all the images for submission.

The train CSV file contains the training data: it maps each image name to the has_cactus column, which is 1 if the image has a cactus and 0 otherwise.

The sample submission CSV file shows the submission format you need to follow. The file names listed there correspond to the files inside the test folder.

train_df = pd.read_csv("train.csv")

Load the train CSV file into a data frame.
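
If you want to sanity-check what was just loaded (an optional step, not part of the original walkthrough), a quick look at the frame and the label balance could be:

train_df.head()                        # image file names with their has_cactus labels
train_df['has_cactus'].value_counts()  # how many positive vs. negative examples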

data_folder = Path(".")
train_img = ImageList.from_df(train_df, path=data_folder, folder='train')

Create an image list with the ImageList.from_df method to map the train_df data frame to the images inside the train folder.

Data Augmentation

This is a technique for creating more data from your existing data. An image of a cat flipped vertically is still a cat. By doing this you can basically multiply your data set by two, four, or even 16 times.

You will need this technique a lot if you happen to have very little data to work with.

transformations = get_transforms(do_flip=True, flip_vert=True, max_rotate=10.0, max_zoom=1.1, max_lighting=0.2, max_warp=0.2, p_affine=0.75, p_lighting=0.75)

FastAI gives you a nice method that does all of this, called get_transforms. You can flip the image vertically or horizontally, rotate, zoom, adjust lighting/brightness, and warp it.

You can play with the parameters I stated above to see how the results look, or open the documentation and read about them in detail.

Of course, apply the transformation to your image list:

train_img = train_img.transform(transformations, size=128)

The size parameter is used to scale the input up or down to match the neural network you will use. The network I will use is called DenseNet, which won the Best Paper Award at CVPR 2017, and I will feed it images 128×128 pixels in size.

Training Preparation

After loading your data, you need to prepare yourself and your data for the most important phase in Deep Learning, called training. Basically, this is the learning in Deep Learning: the network learns from your data and updates itself accordingly so that it performs well on your data.

test_df = pd.read_csv("sample_submission.csv")
test_img = ImageList.from_df(test_df, path=data_folder, folder='test')

train_img = (train_img
             .split_by_rand_pct(0.01)
             .label_from_df()
             .add_test(test_img)
             .databunch(path='.', bs=64, device=torch.device('cuda:0'))
             .normalize(imagenet_stats))

For the training step, you need to split off a small portion of your training data, called validation data. You can't touch this data during training, because it is your validation tool: when your Convolutional Neural Network performs well on the validation data, it will likely perform well on the test data you submit.

FastAI has a convenient method called split_by_rand_pct to split off a portion of your data as validation data.

It also has the databunch method to perform batch processing. I used a batch size of 64 because that is what my GPU can handle. If you don't have a GPU, omit the device parameter.

Then the normalize method is called to normalize your images, because you will use a pre-trained network. imagenet_stats normalizes the images the same way the inputs were normalized when the pre-trained network was trained for the ImageNet competition.
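
For reference, imagenet_stats is essentially the per-channel mean and standard deviation of ImageNet images, and the normalization amounts to something like the following (the values shown are the commonly quoted ImageNet statistics):

imagenet_mean = [0.485, 0.456, 0.406]  # per-channel RGB mean
imagenet_std = [0.229, 0.224, 0.225]   # per-channel RGB standard deviation
# normalized_pixel = (pixel - mean) / std, applied channel by channel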

Adding the test data to the training image list makes it easy to run predictions later without extra pre-processing. Remember, these images will not be trained on and will not go into your validation set; you just want them pre-processed the same way as the training images.
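
If you want to eyeball the result of the augmentation and normalization before training, fastai's data bunch can display a sample batch (an optional check, not part of the original steps):

# Show a grid of transformed training images with their labels.
train_img.show_batch(rows=3, figsize=(7, 7))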

learn = cnn_learner(train_img, models.densenet161, metrics=[error_rate, accuracy])

You are done preparing your training data. Now, create a learner with cnn_learner. As I said before, I will use DenseNet as the pre-trained network, but you can use any other network offered in TorchVision.
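
Swapping the backbone is a one-line change. For example, a smaller ResNet would look like this (just an illustrative alternative, not the model used for my submission):

# Same data, different pre-trained backbone.
learn = cnn_learner(train_img, models.resnet34, metrics=[error_rate, accuracy])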

The One-Cycle Technique

You could start your training right now. But there is always one point of confusion when training any Deep Neural Network, Convolutional Neural Networks included: choosing the right learning rate. The training algorithm is called Gradient Descent, and it tries to decrease the error by taking steps whose size is controlled by a parameter called the learning rate.

A bigger learning rate makes the training steps faster, but it is prone to overstepping, which can make the error spiral out of control like in the picture above. A smaller learning rate makes the training steps slower, but the error will not go out of control.

So, choosing the right learning rate is really important: make it as big as possible without losing control.
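
Concretely, each Gradient Descent step nudges every weight against the gradient of the error, scaled by the learning rate. A bare-bones sketch of one such update (illustrative only; in practice the optimizer inside fit_one_cycle handles this):

# One Gradient Descent step for a single weight:
#   w_new = w_old - learning_rate * d(error)/d(w)
w = 0.5            # hypothetical current weight
grad = 1.2         # hypothetical gradient of the error with respect to w
learning_rate = 3e-2
w = w - learning_rate * grad  # the weight moves a small step "downhill"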

It is easier said than done.

So along came a person named Leslie Smith, who created a technique called the 1-cycle policy.

Intuition-wise, you sweep over (essentially brute-force) a range of learning rates and pick one where the error is nearly minimal but still has some room to improve. Let's try it out in our code.

learn.lr_find()
learn.recorder.plot()

It will print something like this:

The minimum sits at around 10⁻¹. So I think we can use something smaller than that, but not too small. Maybe 3 × 10⁻² is a good choice. Let's try it!

lr = 3e-02
learn.fit_one_cycle(5, slice(lr))

Train for several epochs (I chose 5, not too many and not too few), and let's see the result.

Wait, what!?

Our simple solution gives us 100% accuracy on the validation split! It is remarkably effective, and it only needed six minutes to train. What a stroke of luck! In real life, you would usually run several iterations just to find out which approaches do better than others.
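
At this point it can be worth plotting the training curves and checkpointing the model before experimenting further (optional steps using fastai's built-in recorder and save; the original run did not depend on them):

learn.recorder.plot_losses()    # training vs. validation loss over the run
learn.save('densenet-stage-1')  # store the weights so this run can be restored later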

I am eager to submit! Haha. Let’s predict the test folder and submit the result.

preds, _ = learn.get_preds(ds_type=DatasetType.Test)
test_df.has_cactus = preds.numpy()[:, 0]

Because you already put the test images in the training image list, you do not need to pre-process or load the test images separately.
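
One thing worth double-checking is which probability column corresponds to has_cactus = 1, since get_preds returns one column per class in the order given by learn.data.classes (a quick guard against submitting inverted labels):

print(learn.data.classes)  # e.g. [0, 1]; pick the column whose class is 1
# If the positive class sits in column 1 rather than column 0, use:
# test_df.has_cactus = preds.numpy()[:, 1]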

test_df.to_csv('submission.csv', index=False)

This line creates a CSV file containing the image names and the has_cactus column for all 4,000 test images.

When I tried to submit, I realized that you actually need to submit the CSV via a Kaggle kernel. I had missed that.

But luckily, the kernel is essentially the same as your Jupyter notebook. You can just copy and paste everything you built in your notebook and submit it there.

And BAM!

Good Lord! I got 0.9999 for the public score. That's really good. But of course, with a first attempt like that, I wanted a perfect score.

So I made several tweaks to the network, and once more, BAM!
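
The article doesn't spell out which tweaks; a typical next step in fastai is to unfreeze the pre-trained layers and fine-tune the whole network with smaller, layer-wise learning rates, sketched below (an assumed example, not my exact change):

learn.unfreeze()  # allow the pre-trained DenseNet layers to update as well
# Fine-tune with smaller learning rates for early layers, larger for later ones.
learn.fit_one_cycle(3, slice(1e-5, 1e-4))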

I did it! So can you. It’s actually not that hard.

(BTW, this rank was captured on April 13th, so it may have dropped by now…)

What I Learned

This problem is easy, so you will not face any weird challenges while solving it. That makes it one of the most suitable projects to start with.

Alas, because many people get a perfect score on this, I think the admins need to create another, harder test set for submission.

Whatever the reason, there is no barrier for you to try this. You can try this right now and get good results.

Final Thoughts

Convolutional Neural Networks are helpful for a huge range of tasks, from image recognition to image generation. Analyzing images nowadays is not as hard as it used to be. Of course, you can do it too if you try.

Just get started: pick a good Convolutional Neural Network project and get good data.

Good luck!

This article was originally published on my blog at thedatamage.

Translated from: https://www.freecodecamp.org/news/everything-you-need-to-know-to-master-convolutional-neural-networks-ef98ca3c7655/
