[Paper Reading] High Recognition Accuracy on a Small Plant Disease and Pest Dataset: Crop Disease Image Classification Based on Transfer Learning With DCNNs

I. Background

Paper title: Crop Disease Image Classification Based on Transfer Learning With DCNNs
Paper download (DOI): 10.1007/978-3-030-03335-4_40
Project page: none
Keywords: Transfer learning · Deep learning · Image classification · DCNN · Crop diseases

II. Brief Introduction to the Paper

Abstract of the paper:

Abstract. Machine learning has been widely used in the crop disease image classification. Traditional methods relying on the extraction of hand-crafted low-level image features are difficulty to get satisfactory results. Deep convolutional neural network can deal with this problem because of automatically learning the feature representations from raw image data, but require enough labeled data to obtain a good generalization performance. However, in the field of agriculture, the available labeled data in target task is limited. In order to solve this problem, this paper proposes a method which combines transfer learning with two popular deep learning architectures (i.e., AlexNet and VGGNet) to classify eight kinds of crop diseases images. First, during the training procedure, the batch normalization and DisturbLabel techniques are introduced into these two networks to reduce the number of training iterations and overfitting. Then, after training the pre-trained model by using the open source dataset PlantVillage. Finally, we fine-tune this model with our relatively small dataset preprocessed by a proposed strategy. The experimental results reveal that our approach can achieve an average accuracy of 95.93% compared to state-of-the-art method for our relatively small dataset, demonstrating the feasibility and robustness of this approach.

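State of research: In recent years, the application of computer vision and machine learning has brought considerable progress to crop disease and pest identification. Classical methods such as support vector machines (SVM), k-nearest neighbors (KNN), and discriminant analysis have been applied, and deep neural networks are now widely used in image recognition, where they reach high accuracy.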

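III. Objective

Use deep convolutional neural networks, in place of traditional methods that rely on hand-crafted low-level image features, to solve the crop disease image classification problem.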

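IV. Problems Identified

The dataset contains too few samples. In addition, traditional methods require segmenting the diseased region and extracting specific features, which is often difficult for some crop diseases, and crop disease information cannot be fully represented by such specific features alone.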

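V. Approach

A convolutional neural network can extract features directly from raw images through its convolutional and pooling layers and use them for recognition and classification. The networks are first trained on PlantVillage (an open-source dataset with 38 classes and more than 50,000 images) to obtain pre-trained models. Transfer learning is then combined with the two network architectures to fine-tune the models on the authors' own, relatively small dataset, and batch normalization (BN) and DisturbLabel (DL) are introduced into both networks during training. Several groups of comparison experiments are set up: VGGNet versus AlexNet, the Center versus the Corner dataset, and with versus without BN and DL.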
[Figure 1: VGGNet]
[Figure 2: AlexNet]

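VI. Dataset Processing

The dataset is prepared in two ways, center cropping and corner cropping.

Center cropping: a 300 × 300 square region is cropped from the center of each image. This removes most of the complex background, and the number of images stays the same.

Corner cropping: the central region is first cropped at 512 × 512 resolution, which keeps most of the complex background. The crop is then split into four parts of 256 × 256 each. Finally, these images are resized with bilinear interpolation to two different sizes (the input size of AlexNet, and 224 × 224 pixels for VGGNet). These operations are applied to every image, and the resulting images that contain no lesion area are filtered out.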
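The two cropping strategies can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' code: the function names `center_crop` and `corner_crops` are hypothetical, an image is modeled as a nested list of pixels, and the bilinear resize and the lesion filtering steps are left out.

```python
def center_crop(image, size):
    """Return the central size x size region of a 2-D image (list of rows)."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image dimensions")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def corner_crops(image, size=512):
    """Center-crop to size x size, then split the crop into four quadrants."""
    region = center_crop(image, size)
    half = size // 2
    quadrants = []
    for top in (0, half):
        for left in (0, half):
            quadrants.append(
                [row[left:left + half] for row in region[top:top + half]]
            )
    return quadrants


# Toy 600 x 800 "image" whose pixel value records its (row, column) position.
image = [[(r, c) for c in range(800)] for r in range(600)]
center = center_crop(image, 300)   # one 300 x 300 image per input
tiles = corner_crops(image, 512)   # four 256 x 256 images per input
print(len(center), len(center[0]))
print(len(tiles), len(tiles[0]), len(tiles[0][0]))
```

Center cropping keeps one image per input, while corner cropping produces up to four, which is why the Corner dataset can end up with a different number of images once tiles without lesions are discarded.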

VII. Results

During training the pre-trained models, for comparison, we train the models by using the PlantVillage dataset on two different network architectures. The dataset is split into two sets, namely training set (80% of the dataset) and validation set (20% of the dataset). Since the learning always converges well within 100 epochs based on the empirical observation, each of these experiments runs for 100 epochs, where one epoch is defined as the number of training iterations in which the neural network has completed a full pass of the whole training set

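The comparison experiments show that accuracy on the Corner dataset is lower than on the Center dataset, mainly because the dataset is too small. Accuracy is also higher when both BN and DL are added to the network. On the relatively small dataset, the method reaches an average accuracy of 95.93%, which demonstrates its feasibility.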

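VIII. Hyperparameter Settings

Dropout is set to 0.5, the learning rate to 0.01, and the decay rate to 0.98. Variables are randomly initialized from a normal distribution with a variance of 0.01 (the mean is not stated here). The data is split into a training set (80% of the dataset) and a validation set (20% of the dataset).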
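As a rough illustration of these settings, the sketch below shows an 80/20 split and an exponentially decayed learning rate. The notes do not say how often the decay is applied, so applying it once per epoch is an assumption, and the helper names `learning_rate` and `split_dataset` are hypothetical.

```python
import random

BASE_LR = 0.01      # initial learning rate from the notes
DECAY_RATE = 0.98   # decay rate from the notes


def learning_rate(epoch, base_lr=BASE_LR, decay_rate=DECAY_RATE):
    """Exponentially decayed learning rate (per-epoch decay is an assumption)."""
    return base_lr * decay_rate ** epoch


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into train and validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, val = split_dataset(range(1000))
print(len(train), len(val))
for epoch in (0, 10, 50, 99):
    print(epoch, round(learning_rate(epoch), 6))
```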

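IX. Supplementary Notes

DisturbLabel: in each training iteration, some samples are randomly selected and trained with incorrect labels. This simple method proves effective at preventing CNN models from overfitting, and it can be combined with Dropout to good effect.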
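A minimal sketch of the label disturbance step is shown below. It assumes that each label is replaced, with probability `noise_rate`, by a class drawn uniformly from all classes, so a "disturbed" label can occasionally equal the true one. The function name `disturb_labels` and the value 0.1 are hypothetical; the eight classes match the eight crop diseases mentioned in the abstract.

```python
import random


def disturb_labels(labels, num_classes, noise_rate=0.1, rng=None):
    """Return a copy of labels in which each label is, with probability
    noise_rate, replaced by a class drawn uniformly at random."""
    if rng is None:
        rng = random.Random()
    disturbed = []
    for label in labels:
        if rng.random() < noise_rate:
            # The uniform draw may return the original label by chance.
            disturbed.append(rng.randrange(num_classes))
        else:
            disturbed.append(label)
    return disturbed


# Labels for one pass over a toy dataset with eight classes.
labels = [i % 8 for i in range(1000)]
noisy = disturb_labels(labels, num_classes=8, noise_rate=0.1,
                       rng=random.Random(0))
changed = sum(1 for a, b in zip(labels, noisy) if a != b)
print(changed, "of", len(labels), "labels changed")
```

The disturbance is redrawn at every iteration, so no sample is permanently mislabeled; the noise acts as a regularizer on the loss rather than as a corruption of the dataset.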
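Batch Normalization: BN addresses the problem that signals cannot propagate forward effectively when a deep network has too many layers. The outputs of each layer have a different mean and variance, so the distribution of the data differs from layer to layer.
Advantages:
1. It not only speeds up convergence but, more importantly, alleviates to some extent the "vanishing gradient" problem in deep networks (hidden layers close to the output layer have larger gradients and update quickly, while hidden layers close to the input layer have small gradients and update slowly, staying almost at their random initial state; gradient explosion is the opposite case). In short, the gradients are quite unstable.
2. It helps control overfitting, so dropout can be reduced or omitted.
3. It reduces the network's sensitivity to weight initialization.
4. It allows a larger learning rate.
Using BN therefore makes model training easier and more stable.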
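The normalization step itself is short. The sketch below shows the training-time forward pass for a single feature over one mini-batch, with a hypothetical function name `batch_norm` and default values for the learnable scale `gamma` and shift `beta`; at inference time, implementations normally use running statistics instead of the batch statistics.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift it."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
print([round(v, 3) for v in normalized])
```

After this step every layer sees inputs with roughly zero mean and unit variance, whatever the distribution of the previous layer's outputs was.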

TODO

When choosing a dataset for auxiliary training, pick a better pre-training dataset, use a deeper network, and so on.
