原文 | 译文 |
---|---|
Early diagnosis and accurate identification of apple tree leaf diseases (ATLDs) can control the spread of infection, reduce the use of chemical fertilizers and pesticides, improve the yield and quality of apples, and maintain the healthy development of apple cultivars. To improve detection accuracy and efficiency, an early diagnosis method for ATLDs based on a deep convolutional neural network (DCNN) is proposed. We first collect images of apple tree leaves with and without diseases from both laboratories and cultivation fields, and establish a dataset containing five common ATLDs and healthy leaves. The DCNN model proposed in this paper for ATLD recognition combines DenseNet and Xception, using global average pooling instead of fully connected layers. We extract features with the proposed convolutional neural network and then use a support vector machine to classify the apple leaf diseases. Several DCNNs, including the proposed one, are trained for ATLD recognition. The proposed network achieves an overall accuracy of 98.82% in identifying the ATLDs, which is higher than that of Inception-v3, MobileNet, VGG-16, DenseNet-201, Xception, and VGG-INCEP. Moreover, the proposed model has the fastest convergence rate, a relatively small number of parameters, and high robustness compared with the mentioned models. This research indicates that the proposed deep learning model provides a better solution for ATLD control. It could also be integrated into smart apple cultivation systems. | 苹果树叶病害(ATLDs)的早期诊断和准确识别可以控制感染的传播,减少化肥和农药的使用,提高苹果的产量和品质,维护苹果品种的健康发展。为了提高检测精度和效率,提出了一种基于深度卷积神经网络(DCNN)的ATLD早期诊断方法。我们首先从实验室和栽培田收集有病害和无病害的苹果树叶图像,并建立包含五种常见 ATLD 和健康叶子的数据集。本文提出的用于ATLDs识别的DCNN模型结合了DenseNet和Xception,使用全局平均池化而不是全连接层。我们通过提出的卷积神经网络提取特征,然后使用支持向量机对苹果叶病进行分类。包括提议的 DCNN,有几个 DCNN 被训练用于 ATLDs 识别。所提出的网络在识别 ATLD 方面的总体准确率为 98.82%,高于 Inception-v3、MobileNet、VGG-16、DenseNet-201、Xception、VGG-INCEP。此外,与上述模型相比,所提出的模型具有最快的收敛速度、相对较少的参数和较高的鲁棒性。这项研究表明,所提出的深度学习模型为 ATLD 控制提供了更好的解决方案。它还可以集成到智能苹果种植系统中。 |
Keywords: apple tree leaf diseases; deep convolutional neural network; transfer learning; model fusion |
关键词:苹果树叶病害; 深度卷积神经网络; 迁移学习; 模型融合 |
原文 | 译文 |
---|---|
Mosaic, Rust, Grey spot, Brown spot, and Alternaria leaf spot are five common apple tree leaf diseases. Early diagnosis and accurate identification of apple tree leaf diseases (ATLDs) can effectively control the spread of infection, reduce losses, and ensure the apple industry’s healthy growth. Traditional plant leaf disease recognition methods mainly rely on expert experience to manually extract the color, texture, and shape features of diseased leaf images [1–3]. Due to the complexity and diversity of the captured backgrounds and the disease spots [4], features extracted manually with image analysis methods are usually limited to a specific dataset; when transferred to a new dataset, the identification accuracy is not ideal. Furthermore, most existing apple disease datasets contain images with a plain background, so datasets with natural cultivation backgrounds need to be collected to meet the needs of apple disease identification in natural field environments. | 花叶病、锈病、灰斑病、褐斑病和链格孢属叶斑病是五种常见的苹果树叶病害。 苹果树叶病害(ATLDs)的早期诊断和准确识别可以有效控制感染的传播,减少损失,确保苹果产业的健康发展。 传统的植物叶片病害识别方法主要依靠专家经验手动提取病叶图像的颜色、纹理和形状特征[1-3]。 由于捕获的背景和病斑的复杂性和多样性[4],使用图像分析方法人工提取的特征通常仅限于特定数据集,当转移到新数据集时,识别精度并不理想。 此外,现有的苹果病害数据集大多包含纯背景图像,需要采集自然栽培背景的数据集以满足自然田间环境下苹果病害识别的需要。 |
Deep convolutional neural networks (DCNNs) perform well in processing two-dimensional data, especially in image and video classification tasks [5]. Lee et al. proposed a convolutional neural network (CNN) system that used plant leaves to automatically identify plants [6]. In 2015, Kawasaki et al. studied the recognition of cucumber foliar diseases based on CNNs, classifying two common cucumber leaf diseases and healthy leaves with an average accuracy of 94.9% [7]. The results showed that the classification features extracted by the CNN-based network model could obtain the best classification performance. In 2016, Sladojevic et al. used deep neural networks to identify 13 common plant diseases. Results showed that their model had an average recognition accuracy of 96.3% [8]. Mohanty et al. used AlexNet and GoogLeNet networks with transfer learning methods to identify 26 diseases of 14 crops in the PlantVillage dataset, and the accuracy on a given test dataset was 99.35% [9]. In 2018, Ferentinos et al. used a CNN model to identify plant diseases on a public dataset of 87,848 images covering 58 diseases of 25 species. Their results showed that the highest accuracy could reach 99.53%, and the model could be used as a tool for early warning of plant diseases [10]. Long et al. used AlexNet and GoogLeNet to compare the learning performance of training from scratch and transfer learning. They fine-tuned the DCNNs to identify four leaf diseases and healthy leaves of Camellia oleifera. The experimental results showed that the accuracy of the DCNN was 96.53%, and that transfer learning could accelerate network convergence and improve classification performance [11].
| 深度卷积神经网络 (DCNN) 在处理二维数据方面具有良好的性能,尤其是在图像和视频分类任务中 [5]。李等人提出了一种使用植物叶子自动识别植物的卷积神经网络 (CNN) 系统 [6]。 2015 年,川崎等人研究了基于CNNs对黄瓜叶病的识别,将黄瓜常见的叶病和健康叶分为两种,平均准确率为94.9%[7]。结果表明,基于CNN的网络模型提取的分类特征可以获得最佳的分类性能。 2016 年,Sladojevic 等人使用深度神经网络识别 13 种常见植物病害。结果表明,他们的模型的平均识别准确率为 96.3% [8]。莫汉蒂等人使用 AlexNet 和 GoogLeNet 网络和迁移学习方法识别 PlantVillage 数据集中 14 种作物的 26 种病害,在给定测试数据集上的准确率为 99.35% [9]。费伦蒂诺斯等人使用 CNN 模型识别 2018 年的植物病害,从包含 87,848 张图像和 25 个物种的 58 种病害的公共数据集中。他们的结果表明,最高准确率可以达到 99.53%,该模型可以作为植物病害预警的工具[10]。龙等人使用 AlexNet 和 GoogLeNet 进行实验,比较了从头学习方法和迁移学习方法的学习性能。他们对 DCNN 进行了微调,以识别油茶的四种叶子病害和健康叶子。实验结果表明,DCNN的准确率为96.53%,迁移学习可以加速网络收敛,提高分类性能[11]。 |
In 2017, Zhang et al. proposed an ATLD recognition method based on image processing technology and pattern recognition for three types of ATLDs and healthy leaves [12]. Their dataset included 90 images of healthy apple leaves and leaves with powdery mildew, Mosaic, and Rust diseases. The disease identification accuracy of their method was higher than 90%. In 2017, Liu et al. designed a DCNN based on AlexNet for the identification of four ATLDs. The accuracy reached 97.62% on a dataset containing Mosaic, Rust, Brown spot, and Alternaria leaf spot [13]. In 2019, Baranwal et al. designed a CNN based on LeNet-5 for the identification of three types of ATLDs and healthy leaves. On a dataset with mostly laboratory backgrounds containing Black Rot, Rust, Apple Scab, and healthy leaves, the accuracy reached 98.54% [14]. In 2019, Jiang et al. proposed a CNN model named VGG-INCEP for ATLDs including Mosaic, Rust, Grey spot, Brown spot, and Alternaria leaf spot, which achieved an accuracy of 97.14%, and created a real-time fast disease detection model achieving 78.80% mean average accuracy [15]. In 2020, Yong Zhong et al. proposed three loss functions based on the DenseNet-121 deep convolutional network. On a dataset of general Apple Scab, serious Apple Scab, Grey spot, general Rust, serious Rust, and healthy leaves, the accuracy rates were 93.51%, 93.31%, and 93.71% for the three loss functions, which are better than the accuracy obtained with the cross-entropy loss function [16]. In 2020, Yu et al. proposed a DCNN based on the region of interest to identify ATLDs. A total of 404 images containing Brown spot, Alternaria leaf spot, and healthy leaves were identified, with a recognition accuracy of 84.3% on this dataset [17]. In 2020, Albayati et al. proposed a DCNN that combined speeded-up robust feature extraction with grasshopper optimization algorithm feature selection for the identification of three ATLDs and healthy leaves.
On the dataset of Black Rot, Rust, Apple Scab, and healthy leaves, the accuracy reached 98.28% [18]. | 2017 年,张等人提出了一种基于图像处理技术和模式识别的ATLDs识别方法,用于三种类型的ATLDs和健康叶片[12]。他们的数据集包括 90 张健康的苹果叶子和带有白色粉末、马赛克和锈病的叶子的图像。他们的方法的疾病识别准确率高于90%。 2017 年,刘等人设计了一个基于 AlexNet 的 DCNN,用于识别四个 ATLD。在包含 Mosaic、Rust、Brown spot 和 Alternaria 叶斑病的数据集上,准确率达到了 97.62% [13]。 2019 年,巴兰瓦尔。等。设计了一个基于 LeNet-5 的 CNN,用于识别三种类型的 ATLD 和健康叶子。在以实验室背景为主的包含黑腐病、锈病、苹果黑星病和健康叶子的数据集上,准确率达到了 98.54% [14]。 2019 年,Jiang 等人提出了一个名为 VGG-INCEP 的 CNN 模型,用于包括 Mosaic、Rust、Grey spot、Brown spot 和 Alternaria 叶斑在内的 ATLD,其准确率达到 97.14%,并创建了一个实时快速疾病检测模型,平均准确率达到 78.80% [15]。 2020年,永忠等人提出了三个基于 DenseNet-121 深度卷积网络的损失函数。在一般苹果黑星病、严重苹果黑星病、灰斑病、一般锈病、严重锈病和健康叶片的数据集上,三种损失函数的准确率分别为93.51%、93.31%和93.71%,优于交叉熵损失函数[16]。 2020 年,Yu 等人提出了一种基于感兴趣区域的 DCNN 来识别 ATLD。共识别出 404 张包含褐斑病、链格孢属叶斑病和健康叶片的图像。在数据集上,识别准确率达到了 84.3% [17]。 2020 年,Albayati 等人提出了一种结合加速鲁棒特征提取和蚱蜢优化算法特征的 DCNN,用于识别三个 ATLD 和健康叶子。在黑腐病、锈病、苹果黑星病和健康叶子的数据集上,准确率达到了 98.28% [18]。 |
In summary, DCNNs have achieved satisfactory results in crop disease recognition. However, the number of ATLD types that can be identified in existing research is limited, and the accuracy under real usage scenarios needs to be improved. |
综上所述,DCNN 在作物病害识别领域取得了令人满意的结果。 但现有研究中可识别的ATLD类型数量有限,实际使用场景下的准确性有待提高。 |
To address the above problems, this study proposes a DCNN model named Xception Dense Net (XDNet), which combines depthwise separable convolution [19] with densely connected structures [20], applies transfer learning, and uses a global average pooling layer instead of the fully connected layer. This paper uses XDNet to extract apple leaf disease features and a support vector machine (SVM) to classify the diseases. Compared with other CNNs on classification and recognition performance, the proposed XDNet model reaches an identification accuracy of 98.82% on the testing dataset, which is higher than that of the other mentioned CNNs under the same transfer learning and data preprocessing methods. Moreover, using image augmentation and transfer learning increases the accuracy by 7.59%. | 针对上述问题,本研究提出了一种名为 Xception Dense Net (XDNet) 的 DCNN 模型,该模型结合了深度可分离卷积 [19] 和密集连接结构 [20],该模型应用迁移学习并使用全局平均池化层代替全连接层。 本文使用XDNet提取苹果叶片病害特征,并使用支持向量机(SVM)对病害进行分类。 将分类识别性能与其他CNNs进行比较,实验结果表明,所提出的XDNet模型在测试数据集上的识别准确率为98.82%,高于采用相同迁移学习和数据预处理方法的其他CNNs。 此外,使用图像增强技术和迁移学习将准确率提高了 7.59%。 |
The main contributions of this article are summarized as follows: Firstly, in order to improve the robustness of the model and reduce over-fitting, we collect images of diseased apple tree leaves in laboratory and field conditions, in different seasons, at different times of the day, and under different exposure conditions. In addition, we enlarge the dataset with augmentation techniques: rotation, mirroring, Gaussian noise, salt-and-pepper noise, and adjustment of the brightness, sharpness, and contrast of images [21]. The established dataset can closely simulate the real shooting environment, image acquisition noise, lighting changes, and geometric transformations. Secondly, inspired by the depthwise separable convolution structure with residual connections used by Xception [19] and the feature reuse characteristic of the dense block in DenseNet [20], this paper proposes a DCNN model to identify ATLDs that combines depthwise separable convolution with a densely connected structure. The depthwise separable convolution structure reduces network parameters and improves training speed, while dense blocks integrate shallow features into deep features better and achieve better feature reuse. The rest of the paper is arranged as follows: Section 2 introduces the collection, division, and preprocessing of the ATLD dataset. Section 3 introduces the basic structure of Xception and DenseNet, and focuses on the proposed XDNet, a deep convolutional network model for ATLDs. Section 4 describes the workflow of the ATLD recognition system and evaluates the performance of the proposed network through experiments. Finally, Section 5 summarizes the work. |
本文的主要贡献总结如下: 首先,为了提高模型的鲁棒性并减少过拟合,我们在实验室和野外条件下、不同季节、一天中的不同时间和不同的暴露条件下收集苹果树病叶图像。此外,我们使用旋转、镜像、高斯噪声、椒盐噪声等增强技术,调整图像的亮度、锐度、对比度 [21],从而扩大了数据集。建立的数据集可以很好地模拟真实拍摄环境、图像采集噪声、光线变化和变换变化。 其次,受 Xception [19] 使用的具有残差连接的深度可分离卷积结构和 DenseNet [20] 的密集块中的特征重用特性的启发,本文提出了一种 DCNN 模型来识别 ATLD,它是深度可分离的组合卷积和密集连接结构。深度可分离的卷积结构减少了网络参数,提高了训练速度,而密集块更好地将浅层特征融入深层特征,实现更好的特征复用。 其余工作安排如下:第 2 节介绍了 ATLDs 数据集的收集、划分和预处理。第 3 节介绍了 Xception 和 DenseNet 的基本结构,并重点介绍了所提出的 XDNet,它是一种用于 ATLD 的深度卷积网络模型。第 4 节描述了 ATLD 识别系统的工作流程以及通过实验评估的拟议网络性能。最后,第 5 节总结了工作。 |
原文 | 译文 |
---|---|
In order to complete the classification and identification of common ATLDs, firstly, we collect the dataset that can simulate the actual usage scenarios of the system. Then, we complete the dataset preprocessing tasks such as image scaling, dataset expansion, and dataset normalization. Finally, the dataset is divided into three parts for training, validation, and testing. | 为了完成对常见ATLD的分类识别,我们首先收集了能够模拟系统实际使用场景的数据集。 然后,我们完成了图像缩放、数据集扩展和数据集归一化等数据集预处理任务。 最后,数据集分为训练、验证和测试三部分。 |
原文 | 译文 |
---|---|
Apple tree leaf disease types vary with season, humidity, temperature, light, and other factors. Apple tree leaves may be infected by pathogenic bacteria at any time from budding until the leaves fall. In order to fully describe the incidence of the five apple leaf diseases selected and identified in this paper, images of apple leaves with different levels of disease were shot in the laboratory (about 38.7%) and in real cultivation fields (about 61.3%) under various weather conditions and at various times of day, which helps make the proposed method more robust. A total of 2970 images of ATLDs and healthy leaves were collected. The dataset was evaluated by experts to ensure its validity. The dataset contains six classes in total: the five diseases Mosaic, Rust, Grey spot, Brown spot, and Alternaria leaf spot, plus healthy leaves. These five apple leaf diseases were selected because they are frequently observed in the apple growing area of Shaanxi province, P.R. China, where they can cause serious economic losses. | 苹果树叶病的类型因季节、湿度、温度、光线和其他因素而异。苹果树叶可能被病原菌感染,从树芽到树叶脱落。为了全面描述本文选择鉴定的5种苹果叶片病害的发生率,分别在实验室(约38.7%)和实际栽培田(约61.3%)拍摄了不同程度病害的苹果叶片图像。天气条件和时间段,保证了所提出的方法具有更高的鲁棒性。总共收集了 2970 张 ATLD 和健康叶子的图像。数据集由专家评估以确保有效性。该数据集包含五种不同的病害和健康叶片,共六种类型,包括马赛克、锈病、灰斑病、褐斑病、链格孢属叶斑病和健康叶片。之所以选择这五种苹果叶病,是因为它们在中国陕西省的苹果种植区经常被发现,会造成严重的经济损失。 |
The lesions caused by the same disease show similarity under similar natural conditions. Figure 1 shows the representative images of five common leaf diseases and healthy leaves. It can be seen that five common diseases have obvious distinguishable visual characteristics. The bright yellow spots of Mosaic spread throughout the leaves [22]. The dark brown herpes of Brown spot is morphologically different from other lesions. Near-round yellowish brown lesions are found in the early stage of Grey spot, and then the lesions turn gray subsequently, therefore, the Grey spot in its early stage is easy to be confused with Alternaria leaf spot. The diseased spots of Alternaria leaf spot often have a dark spot or a concentric wheel pattern in the center, which distinguishes them from other lesions. Rust is composed of rusty yellow dots with brown acicular dots in the center of these dots, due to this significant difference, making it easily distinguished from other diseases [15]. Therefore, it is feasible to classify and identify common ATLDs by visual features. | 由同一疾病引起的病变在相似的自然条件下表现出相似性。图 1 显示了五种常见叶片病害和健康叶片的代表性图像。可以看出,五种常见疾病具有明显可区分的视觉特征。马赛克的亮黄色斑点遍布整个叶子[22]。褐斑的深褐色疱疹在形态上不同于其他病变。灰斑病早期出现近圆形黄褐色病斑,病斑随后变为灰色,因此,早期的灰斑病易与链格孢属叶斑病相混淆。链格孢叶斑病病斑中心常有黑斑或同心轮纹,与其他病斑区分开来。锈病是由锈迹斑斑的黄色小点组成,这些小点的中心有棕色针状小点,由于这种显着差异,很容易与其他疾病区分开来[15]。因此,通过视觉特征对常见的ATLD进行分类识别是可行的。 |
原文 | 译文 |
---|---|
Due to powerful end-to-end learning, deep learning models do not require much image preprocessing. We apply data augmentation and data normalization during the preprocessing step. | 由于强大的端到端学习,深度学习模型不需要太多的图像预处理。 我们在预处理步骤中应用数据增强和数据规范化。 |
原文 | 译文 |
---|---|
One of the main advantages of CNNs is their ability to generalize, that is, their ability to process data that has never been observed. But when the data size is not large enough and the data diversity is limited, they tend to over-fit the training data, which means they cannot be generalized [23]. In order to enhance the generalization ability of the network and reduce over-fitting, the dataset was expanded by data augmentation technology to simulate changes in lighting, exposure, angle, and noise during the preprocessing step of apple leaf images. In order to simulate these changes, images are processed by increasing and decreasing the brightness value by 30%, increasing the contrast by 50% and decreasing the contrast by 20%, increasing the sharpness by 100% and decreasing the sharpness by 70%, respectively. By using rotation (90°, 180°, 270°), flipping (horizontal and vertical), mirroring, and symmetry operation, the actual shooting angles were simulated. At the same time, in order to simulate the noise that may occur during the image acquisition process, the dataset is further enhanced by adding the interference of appropriate Gaussian noise or salt-and-pepper noise, which can reduce the over-fitting phenomenon in the CNN training stage [2,3,24]. The data augmentation techniques used in this paper are shown in Figure 2. The size of the training dataset has increased by 13 times to 24,976 images after the data augmentation. | CNN 的主要优势之一是它们的泛化能力,即它们处理从未被观察到的数据的能力。但是当数据量不够大且数据多样性有限时,它们往往会过度拟合训练数据,这意味着它们不能泛化[23]。为了增强网络的泛化能力,减少过拟合,通过数据增强技术对数据集进行扩展,以模拟苹果叶图像预处理步骤中光照、曝光、角度和噪声的变化。为了模拟这些变化,图像的处理分别是亮度值增加和减少30%,对比度增加50%,对比度减少20%,锐度增加100%,锐度降低70%。 .通过旋转(90°、180°、270°)、翻转(水平和垂直)、镜像和对称操作,模拟实际拍摄角度。同时,为了模拟图像采集过程中可能出现的噪声,通过加入适当的高斯噪声或椒盐噪声的干扰,进一步增强数据集,可以减少过拟合现象。 CNN 训练阶段 [2,3,24]。本文使用的数据增强技术如图 2 所示。 数据增强后训练数据集的大小增加了 13 倍,达到 24,976 张图像。 |
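
The geometric, brightness, and noise augmentations described above can be sketched in a few lines. This is a minimal illustration in pure Python on a single-channel image stored as a list of rows, not the authors' pipeline: the function names, the salt-and-pepper `amount` of 0.02, and the choice to scale pixel values for brightness are assumptions, and the contrast, sharpness, and Gaussian-noise variants are omitted for brevity.

```python
import random


def rotate90(img):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [list(row) for row in img[::-1]]


def adjust_brightness(img, factor):
    """Scale pixel values by `factor` (1.3 = +30%, 0.7 = -30%), clipped to [0, 255]."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def salt_and_pepper(img, amount, rng):
    """Replace roughly a fraction `amount` of the pixels with 0 or 255 (assumed rate)."""
    out = [list(row) for row in img]
    for r in range(len(out)):
        for c in range(len(out[r])):
            if rng.random() < amount:
                out[r][c] = rng.choice((0, 255))
    return out


def augment(img, rng):
    """Return several augmented copies of one training image."""
    rot90 = rotate90(img)
    rot180 = rotate90(rot90)
    rot270 = rotate90(rot180)
    return [
        rot90, rot180, rot270,
        flip_horizontal(img), flip_vertical(img),
        adjust_brightness(img, 1.3), adjust_brightness(img, 0.7),
        salt_and_pepper(img, 0.02, rng),
    ]
```

Calling `augment(image, random.Random(0))` on each training image yields eight extra variants per original in this sketch; the paper's full set of transformations produces more.
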
原文 | 译文 |
---|---|
Deep neural networks are very sensitive to the range of the input features, and a wide range of feature values can cause instability during model training [25]. In order to improve the CNN convergence speed and learn subtle differences between images, the dataset is normalized. For each image channel, the data are normalized by Equation (1). | 考虑到深度神经网络对输入特征范围非常敏感,较大范围的特征值会在模型训练过程中造成不稳定[25]。 为了提高CNN收敛速度和学习图像之间的细微差异,数据集被归一化。 对于每个图像通道,数据通过等式 (1) 进行归一化。 |
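
Equation (1) is not reproduced in this excerpt, so the exact formula is unknown here. As a hedged illustration only, the sketch below assumes the common per-channel standardization (subtract the channel mean, divide by the channel standard deviation); the function name and the small `eps` guard are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_channel(values, eps=1e-8):
    """Standardize one image channel to zero mean and unit variance.

    Assumption: the paper's Equation (1) may differ from this formula.
    `values` is a flat list of pixel intensities for a single channel.
    """
    mean = fmean(values)
    std = pstdev(values)
    # eps avoids division by zero for a constant channel
    return [(v - mean) / (std + eps) for v in values]
```

Applying the function separately to the red, green, and blue channels keeps the inputs on a comparable scale, which is the stability concern raised in the paragraph above.
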
原文 | 译文 |
---|---|
For the training and testing of the DCNNs, the dataset of 2970 images was divided into three independent subsets: a training dataset, a validation dataset, and a testing dataset. A total of 60% of the dataset forms the training dataset, 20% is used as the validation dataset, and the remaining 20% is used as the testing dataset, with each subset containing both laboratory-background and natural-cultivation-background images. The training set is used to train the network, that is, to learn the weights and biases automatically [23]. The validation set is used to tune the hyper-parameters of the model and perform a preliminary evaluation of the model. Lastly, the testing set is used to evaluate the generalization ability of the final model. The aforementioned data augmentation techniques are applied only to the training dataset; the validation and testing datasets are not augmented. Table 1 shows the number of images in the training, validation, and testing datasets after the above preprocessing. | 对于 DCNNs 的训练和测试,将 2970 张图像的数据集分为三个独立的子集,包括训练数据集、验证数据集和测试数据集。总共60%的数据集构成训练数据集,20%作为验证数据集,剩下的20%作为测试集,确保每个子集都包含实验室背景图像和自然栽培背景图像。训练集用于训练网络,完成网络的自动学习,调整权重和偏差[23]。验证集用于调整模型的超参数并对模型进行初步评估。最后,测试集用于评估最终模型的泛化能力。上述数据增强技术应用于训练数据集,验证和测试数据集没有增强。进行上述数据预处理后,表1显示了训练数据集、验证数据集和测试数据集中的图像数量。 |
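
One way to obtain a 60/20/20 split in which every subset contains every kind of image is to split within each group separately. The sketch below is an assumption about how such a split could be done, not the authors' procedure; `stratified_split` and its parameters are hypothetical names. Passing a `(disease, background)` tuple as the label would also balance laboratory and field images across the subsets.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split sample indices into train/val/test, preserving label proportions.

    `labels[i]` is the group key of sample i; the last fraction is implied
    by whatever remains after the first two.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for idx, label in enumerate(labels):
        by_group[label].append(idx)

    train, val, test = [], [], []
    for indices in by_group.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test
```

Augmentation would then be applied to the `train` indices only, in line with the paragraph above.
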
原文 | 译文 |
---|---|
CNNs started with the original LeNet work [27] in 1998, followed by AlexNet [28], ZFNet [29], VGG [30], GoogLeNet [31], ResNet [32], DenseNet [20], Xception [19], and others. Networks have become deeper, architectures more complex, and the methods for addressing vanishing gradients during back-propagation more refined. Xception uses depthwise separable convolution to separate the spatial convolution and channel convolution operations, and improves network performance with fewer parameters and computations. Its residual structure mitigates gradient dissipation during back-propagation and increases the expressive ability of the model. The dense connection structure of DenseNet enhances feature propagation and makes more effective use of features with fewer parameters, but DenseNet consumes a relatively large amount of memory during training [33]. The depthwise separable convolution of Xception reduces the number of parameters to a certain extent without reducing model performance. Therefore, this paper proposes XDNet, which has a relatively low memory consumption: it keeps the shallow structure of Xception and replaces the latter part of Xception with the densely connected, feature-reusing structure of DenseNet. The following parts first introduce the classic Xception and DenseNet models, and then describe in detail how to fuse these two models to obtain XDNet.
| CNN 始于 1998 年 LeNet [27] 的原始工作,然后是 AlexNet [28]、ZFNet [29]、VGG [30]、GoogLeNet [31]、ResNet [32]、DenseNet [20]、Xception [19] 等.,出现了。网络越来越深,架构越来越复杂,解决反向传播过程中梯度消失的方法也越来越精细。 Xception使用depthwise separable convolution来分离空间卷积和通道卷积操作,通过减少参数和计算来提高网络性能。它的残差结构改善了反向传播过程中的梯度耗散,增加了模型的表达能力。 DenseNet 的密集连接结构增强了特征迁移,可以更有效地利用参数较少的特征,但 DenseNet 在训练过程中消耗的内存量相对较大。简而言之,DenseNet 模型具有良好的特征重用能力,但训练内存消耗大[33]。 Xception的depthwise separable convolution在不降低模型性能的情况下,一定程度上减少了参数量。因此,本文提出了内存消耗相对较低的XDNet,它保留了Xception的浅层结构,并使用DenseNet中具有特征重用特性的密集连接结构代替了Xception的后半部分。以下部分将首先介绍经典的 Xception 和 DenseNet 模型,然后详细介绍如何融合这两个模型以获得 XDNet。 |
原文 | 译文 |
---|---|
Xception [19] is another improvement of Inception-v3 [34] proposed by Google after Inception. The Xception structure is a linear stack of depthwise separable convolutional layers with residual connections. The main features of Xception are as follows: | Xception [19] 是继 Inception 之后 Google 提出的 Inception-v3 [34] 的另一项改进。 Xception 结构是具有残差连接的深度可分离卷积层的线性堆栈。 Xception的主要特点如下: |
In Xception, the Inception module is replaced by a depthwise separable convolution layer, which decomposes the standard convolution into a spatial (depthwise) convolution and a pointwise convolution. The spatial convolution is first performed independently on each channel, the pointwise convolution follows, and the results are finally combined. Depthwise separable convolution can greatly reduce the number of parameters and computations with only a tiny loss of accuracy. Like conventional convolution, it can be used to extract features, but with a lower parameter count and computational cost. | Inception模块被Xception中的depthwise separable convolution层代替,标准卷积分解为空间卷积和逐点卷积。 首先在每个通道上独立进行空间卷积运算,然后是逐点卷积运算,最后将结果连接起来。 使用depthwise separable convolution可以大大减少参数量和计算量,精度损失很小。 这种结构类似于传统的卷积操作,可以用来提取特征。 与传统的卷积操作相比,depth-wise separable convolution的参数数量和计算成本更低。 |
Xception contains 14 modules. Except for the first module and the last module, all modules have added a residual connection mechanism similar to ResNet [32], which significantly accelerates the convergence process of Xception and obtains higher accuracy rate [19]. The structure of Xception network is shown in Figure 3. The front part of the network is mainly used to continuously down sample and reduce the spatial dimension. The middle part is to continuously learn the correlation and optimize the features. The latter part is to summarize and consolidate the features, then Softmax activation function is used to calculate the probability vector of a given input class. | Xception 包含 14 个模块。 除第一个模块和最后一个模块外,所有模块都添加了类似于 ResNet [32] 的残差连接机制,显着加快了 Xception 的收敛过程,获得了更高的准确率 [19]。 Xception网络的结构如图3所示。网络的前部主要用于不断下采样和降低空间维度。 中间部分是不断学习相关性和优化特征。 后半部分是对特征进行汇总和巩固,然后使用Softmax激活函数计算给定输入类的概率向量。 |
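
The parameter saving of depthwise separable convolution mentioned above can be checked with simple arithmetic. The sketch below counts weights only (biases ignored) for one layer; the 3 × 3 kernel and the 128 → 256 channel sizes are illustrative values, not figures from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel
    has its own k x k filter over every input channel."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution: one k x k filter per
    input channel (depthwise) plus a 1 x 1 convolution mixing channels (pointwise)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = standard_conv_params(3, 128, 256)    # 294912
separable = separable_conv_params(3, 128, 256)  # 33920
print(standard, separable, round(standard / separable, 1))
```

For this example layer the separable form needs roughly 8.7 times fewer weights, which is the kind of reduction the text attributes to Xception.
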
原文 | 译文 |
---|---|
Compared with VGG, Inception-v3, Xception, and ResNet, DenseNet requires fewer parameters and reasonable calculation time to achieve the best performance [21]. The main characteristics of the DenseNet model are as follows: | 与 VGG、Inception-v3、Xception 和 ResNet 相比,DenseNet 需要更少的参数和合理的计算时间来达到最佳性能 [21]。 DenseNet模型的主要特点如下: |
The defining feature of DenseNet is that each layer takes the feature maps of all preceding layers as input, and its own feature maps are used as input to all subsequent layers. It clearly distinguishes the information added to the network from the information reused. The connection scheme is shown in Figure 4; it maximizes the information flow between layers in the network and removes the need to re-learn redundant feature maps. Therefore, the number of parameters is greatly reduced, and parameter efficiency is improved. The model improves the flow of information and gradients through the entire network. Each layer has direct access to the gradients from the loss function and to the original input signal, which provides implicit deep supervision and alleviates the vanishing-gradient problem. Moreover, dense connections have a regularizing effect, so they can restrain over-fitting on small training datasets to some extent. | DenseNet 最大的特点是,对于每一层,前面所有层的函数图作为输入,它自己的函数图作为后续所有层的输入。 它清楚地区分了添加到网络中的信息和重复使用的信息。 连接方案如图4所示,保证网络中各层之间的信息流达到最大,不需要重新学习冗余特征映射。 因此,大大减少了参数数量,提高了参数效率。 该模型改善了整个网络的信息流和梯度。 每一层都可以直接访问从损失函数到原始输入信号的梯度,从而实现隐式的深度监测,缓解梯度消失的问题。 而且,密集连接具有正则化作用,因此可以在一定程度上抑制小规模训练数据集上的过拟合。 |
Feature maps of the same size in any two layers are directly connected, which gives good feed-forward characteristics and enhances feature propagation and feature reuse. | 任意两层之间相同大小的函数图直接相连,具有良好的前馈特性,增强了特征传播和特征重用。 |
DenseNet has a small number of filters per convolution operation. Only a small part of the feature map is added to the network, and the remaining feature maps are kept unchanged. This structure reduces the number of input feature maps and helps to build a deep network architecture. | DenseNet 每个卷积操作都有少量的过滤器。 只有一小部分特征图被添加到网络中,其余特征图保持不变。 这种结构减少了输入特征图的数量,有助于构建深度网络架构。 |
The structure of the DenseNet-201 [20] model is shown in Figure 4. Since the output of the dense block connects all the layers in the block, the larger the depth in the dense block is, the larger the size of the feature map becomes, which will increase the calculation costs continuously. Therefore, the transition layer is added between the dense blocks. The transition layer consists of 1 × 1 convolution and 2 × 2 average-pooling. Through the 2 × 2 average pool, the width and height can be halved to improve the computational efficiency [35]. | DenseNet-201[20]模型的结构如图4所示。由于dense block的输出连接了block中的所有层,dense block中的深度越大,特征图的尺寸也越大 成为,这将不断增加计算成本。 因此,在密集块之间添加了过渡层。 过渡层由 1 × 1 卷积和 2 × 2 平均池化组成。 通过2×2的平均池,宽度和高度可以减半以提高计算效率[35]。 |
Due to its feature reuse and hidden depth supervision characteristics, DenseNet can be naturally extended to hundreds of layers, and with the increase of depth and parameters, the accuracy can be improved to a certain degree without over-fitting and performance degradation [20]. | 由于其特征重用和隐藏深度监督特性,DenseNet可以自然地扩展到数百层,并且随着深度和参数的增加,精度可以得到一定程度的提高,而不会过拟合和性能下降[20]。 |
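
Two mechanics described above lend themselves to a short sketch: the channel count inside a dense block grows linearly because each layer's output is concatenated onto its input, and the 2 × 2 average pooling in the transition layer halves width and height. The starting width of 64 channels and the growth rate of 32 are illustrative assumptions, and the helper names are hypothetical.

```python
def dense_block_channels(c_in, growth_rate, num_layers):
    """Channel count of the concatenated feature maps after each layer.

    Every layer adds `growth_rate` new feature maps to everything before it.
    """
    return [c_in + growth_rate * i for i in range(1, num_layers + 1)]


def avg_pool_2x2(fmap):
    """2 x 2 average pooling with stride 2 on one feature map (list of rows)."""
    h, w = len(fmap), len(fmap[0])
    return [
        [(fmap[r][c] + fmap[r][c + 1] + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4
         for c in range(0, w - 1, 2)]
        for r in range(0, h - 1, 2)
    ]


print(dense_block_channels(64, 32, 6))
print(avg_pool_2x2([[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16]]))
```

The first call shows why a transition layer is needed: without the 1 × 1 convolution and pooling between blocks, the concatenated input keeps widening with depth.
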
原文 | 译文 |
---|---|
Xception uses depthwise separable convolutions to reduce model parameters without reducing model performance. The dense blocks of the DenseNet model increase the feature reuse capability of the model. If these two characteristics of Xception and DenseNet are combined, it is possible to improve both feature reuse capability and model performance with a small number of parameters. Therefore, this paper proposes a new DCNN called XDNet for the identification of ATLDs, which integrates Xception and DenseNet. Because the multiple convolutional layers of the model represent the data at different levels of abstraction, low-level, middle-level, and high-level information is extracted in the shallow, middle, and deep layers, respectively [36]. In general, the first convolutional layer extracts low-level features or small local patterns, such as edges and corners, and the last convolutional layer extracts high-level features, such as image structure. Because high-level information has a great influence on discriminating leaf disease types [37], a dense connection structure is added to the deep layers of XDNet to improve the reuse of high-level features. The structure of XDNet is shown in Figure 5a. |
Xception 使用深度可分离卷积来减少模型参数而不降低模型性能。 DenseNet 模型的密集连接的密集块增加了模型特征重用能力 。如果将 Xception 和 DenseNet 的这两个特性结合起来,就可以在少量参数的基础上同时提高特征重用能力和模型性能。因此,本文提出了一种新的DCNN,称为XDNet,用于识别ATLD,它集成了Xception和DenseNet。由于模型的多个卷积层对数据的抽象层次不同,在浅、中、深度学习框架中提取了低层、中层和高层的信息[36]。一般来说,第一个卷积层提取底层特征或小的局部模式,如边缘和角落;最后一个卷积层提取高级特征,例如图像结构。由于高层信息对区分叶病类型有很大影响[37],因此在XDNet的深层添加了密集连接结构,以提高高层特征的特征重用性能。 XDNet 的结构如图 5a 所示。 |
原文 | 译文 |
---|---|
The first half of the model uses the structure of a depthwise separable convolution with residual connections, which is the same as in Xception, as shown in Figure 5b,c. To prevent over-fitting, batch normalization is added after each convolutional layer to avoid the problem of gradient disappearance, increase the classification effect, and greatly accelerate the convergence speed [38]. | 模型的前半部分使用了带有残差连接的depthwise separable convolution的结构,与Xception中相同,如图5b,c所示。 为防止过拟合,在每个卷积层后加入batch normalization,避免梯度消失的问题,增加分类效果,大大加快收敛速度[38]。 |
In order to enhance the feature reuse of high-level features and ensure that the information flow among the high-level layers in the network is maximized, we add dense blocks to the latter part of XDNet. As shown in Figure 5d, the dense block transfer features and gradients more efficiently, and increase model recognition effectively. At the same time, in order to effectively alleviate the problems related to over-fitting, the dropout technology is used. In the training process, some neurons are randomly selected with a given probability and discarded in the network when the weights are updated. The dropout technology prevents excessive cooperative adaptation of neurons and helps to form more meaningful independent features [37]. | 为了增强高层特征的特征重用,并保证网络中高层之间的信息流最大化,我们在XDNet的后半部分添加了dense blocks。 如图 5d 所示,密集块更有效地传递特征和梯度,并有效增加模型识别。 同时,为了有效缓解过拟合相关的问题,使用了dropout技术。 在训练过程中,以给定的概率随机选择一些神经元,并在更新权重时将其丢弃在网络中。 dropout 技术可防止神经元过度合作适应,并有助于形成更有意义的独立特征 [37]。 |
CNNs usually consist of three parts: a convolutional layer, a pooling layer, and a fully connected layer. Convolutional and pooling layers act as feature extractors for the input image, while fully connected layers act as classifiers. The basic purpose of convolution is to automatically extract features from each input image. Compared with traditional feature extractors (SIFT, Gabor, etc.), the strength of CNN lies in its ability to automatically learn the weights and biases of different feature maps, so as to generate powerful feature extractors with specific tasks [39]. | CNN 通常由三部分组成:卷积层、池化层和全连接层。 卷积层和池化层充当输入图像的特征提取器,而全连接层充当分类器。 卷积的基本目的是从每个输入图像中自动提取特征。 与传统的特征提取器(SIFT、Gabor 等)相比,CNN 的优势在于它能够自动学习不同特征图的权重和偏差,从而生成具有特定任务的强大特征提取器 [39]。 |
The activation function is executed after each convolution. Rectified linear units (ReLU) function [28] is a very popular non-linear activation function that introduces non-linearity into CNN. The ReLU function is defined in Equation (2): | 激活函数在每次卷积后执行。 修正线性单元 (ReLU) 函数 [28] 是一种非常流行的非线性激活函数,它将非线性引入 CNN。 ReLU 函数在等式 (2) 中定义: |
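
Equation (2) is not reproduced in this excerpt. The standard definition of the ReLU function, which matches the description that follows, is:

```latex
f(x) = \max(0, x) =
\begin{cases}
x, & x > 0 \\
0, & x \le 0
\end{cases}
```
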
原文 | 译文 |
---|---|
In the ReLU layer, each negative value will be removed from the filter image and replaced with 0. | 在 ReLU 层中,每个负值都会从过滤图像中移除并替换为 0。 |
The parameters of XDNet are shown in Table 2. | XDNet的参数如表2所示。 |
原文 | 译文 |
---|---|
In XDNet, the pooling layer after the convolution layer reduces the dimension of the features. Each sub-sampling layer reduces the size of the convolution map and introduces invariance to possible rotation and translation in the input, which generalizes the output of the convolution layer to a higher level. The max-pooling layer and average-pooling layer slide a fixed-size window with a predefined stride across the feature map. The feature map is compressed to a smaller size by taking the maximum or average value within each window, which reduces the computational complexity and controls over-fitting to a certain degree [40]. | 在 XDNet 中,卷积层之后的池化层可以降低特征的维度。 每个子采样层都减小了卷积图的大小,并在输入中引入了可能的旋转和平移的不变性,从而将卷积层的输出推广到了更高的层次。 最大池化层和平均池化层在特征图上使用固定大小的滑动窗口和预定义的步长。 通过取过滤后的特征图的最大值和平均值将特征图压缩到更小的尺寸,这降低了计算复杂度并在一定程度上控制了过拟合[40]。 |
At the end of XDNet, global average pooling replaces the fully connected layers. It adds no model parameters and allows input images of arbitrary size. The model size and computation are therefore greatly reduced compared with fully connected layers, and over-fitting can be avoided, which accelerates network training [39]. The global average pooling layer extracts a 544-dimensional feature vector and feeds it directly into the classification layer, which directly correlates the high-level features of ATLDs with the classification task. Extensive practice has shown that SVM is effective for small-sample, non-linear, and high-dimensional pattern recognition and diagnosis [41], and replacing the Softmax layer at the top of a CNN with a linear SVM gives a small but consistent advantage [42]. Our experiments likewise show that the compared DCNNs benefit consistently from using a linear SVM instead of Softmax, and the classification accuracy of XDNet with a linear one-vs-all SVM on the testing dataset is 0.17% higher than with Softmax. | 在XDNet的最后,使用全局平均池化代替全连接,无需额外的模型参数,可以实现任意图像尺寸的输入。因此,与全连接相比,模型尺寸和计算量大大减少,并且可以避免过拟合以加速网络训练[39]。全局平均池化层提取一个544维的特征向量,直接输入到分类层,将ATLD的高层特征与分类任务直接关联起来。大量实践证明SVM在处理小样本、非线性和高维模式识别和诊断方面是有效的[41],并且CNNs在用线性SVM代替Softmax层方面取得了小而一致的优势[42]。同时,实验表明对比DCNNs在使用线性SVM代替Softmax后具有一致的优势,在测试数据集上使用线性一对多SVM的XDNet的分类准确率比Softmax高0.17%。 |
Compared with other adaptive learning rate algorithms, the Adam algorithm is easy to implement, computationally efficient, requires less memory, converges faster, and is invariant to diagonal rescaling of the gradient [43]. Therefore, the Adam algorithm is used during back propagation to train the neural network, learning the optimal weights and biases and minimizing the loss. The batch size is set to 16, the number of epochs is set to 50, and the base learning rate is set to 0.01. | 考虑到与其他自适应学习率算法相比,Adam 算法易于实现、计算效率高、占用内存少、收敛速度快、抗梯度对角线缩放[43]。 因此,Adam 算法用于在反向传播中训练神经网络,以学习最佳权重和偏差,最小化神经网络中的损失。 批量大小设置为 16,epoch 设置为 50,基本学习率设置为 0.01。 |
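To make the optimizer step concrete, the sketch below implements a single Adam update for one scalar parameter in pure Python. Only the base learning rate of 0.01 comes from the text. The values of `beta1`, `beta2`, and `eps` are the defaults from the Adam paper and are assumed here, and the quadratic toy loss is purely hypothetical, standing in for the network loss.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Perform one Adam update for a single scalar parameter.

    lr=0.01 is the base learning rate quoted in the text; beta1, beta2 and
    eps are assumed defaults, not values reported for XDNet.
    """
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy problem (hypothetical): minimise (w - 3)^2 starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
w_after_first_step = None
for t in range(1, 1001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t)
    if t == 1:
        w_after_first_step = w
```

Because the bias-corrected update is normalised by the gradient magnitude, the first step moves the parameter by roughly the learning rate regardless of how large the gradient is.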
原文 | 译文 |
---|---|
XDNet, a CNN-based model, is implemented in Python using the Keras deep learning framework. The configuration parameters of the experiments are listed in Table 3. | XDNet是在基于CNN的Keras深度学习框架中使用python语言实现的。 实验的配置参数列于表3中。 |
原文 | 译文 |
---|---|
The ATLD detection process is shown in Figure 6. First, we collect images of diseased and healthy apple leaves from both the laboratory and orchard fields. The original dataset is classified by disease category by experienced professionals and divided into training, validation, and testing datasets. Next, we perform data augmentation on the training dataset and normalize all images. Then the XDNet model proposed in this paper is pre-trained on a subset of the PlantVillage dataset, and the pre-trained model is transferred to the ATLDs dataset collected earlier. Finally, the model detects the specific disease type of each image in the testing dataset. | ATLDs检测过程如图6所示。首先,我们从实验室和果园中收集苹果病叶和健康叶的图像。 原始数据集由经验丰富的专业人员根据疾病类别进行分类,数据集分为训练数据集、验证数据集和测试数据集。 之后,我们对训练数据集进行数据增强,并对所有图像进行归一化。 然后,本文提出的 XDNet 模型在 PlantVillage 数据集的一个子集上进行预训练,然后将训练模型迁移到之前收集的 ATLDs 数据集。 最后,模型检测到测试数据集中每个图像的特定疾病类型。 |
原文 | 译文 |
---|---|
To test the generalization performance and stability of the XDNet model, this paper performs cross-validation five times. A total of 20% of the dataset is selected as the testing dataset, and the remaining 80% is divided into training and validation datasets at a ratio of 3:1, five times by random permutation, ensuring that the ratio of images with field background to images with laboratory background is consistent in each subset. Five models are obtained through training, and their classification accuracies on the test set are 98.82%, 98.65%, 98.15%, 98.15%, and 97.98%, respectively. The average classification accuracy of the five models is 98.35% and the standard deviation is 0.363%, which shows that XDNet has good stability. The model with an accuracy of 98.82% is taken as an example and analyzed below. | 为了测试XDNet模型的泛化性能和稳定性,本文进行了5次交叉验证。 选择总共20%的数据集作为测试数据集,其余80%的数据集以3:1的比例进行五次随机排列分成训练数据集和验证数据集,确保每个子集中野外背景图像与实验室背景图像的比例一致。 通过训练得到五个模型,这五个模型在测试集上的分类准确率分别为98.82%、98.65%、98.15%、98.15%和97.98%。 五个模型的平均分类准确率为 98.35%,标准差为 0.363%,这些数字表明 XDNet 具有良好的稳定性。 以准确率为98.82%的模型为例,分析如下。 |
A confusion matrix built from the predicted and true labels of the testing data is shown in Figure 7. The rows represent the true categories and the columns represent the predicted categories. All correct predictions lie on the diagonal cells. The darker the color of a cell, the greater the probability it represents. Figure 7 shows that mosaic was recognized by the model with 100% accuracy. The classification rate of Alternaria leaf spot is 96.36%, because the training dataset contains relatively few Alternaria leaf spot images. Furthermore, because Alternaria leaf spot and Grey spot have similar geometric features and the exposure conditions are complex, the model easily confuses Alternaria leaf spot with Grey spot. | 根据测试数据的预测标签和真实标签,制作混淆矩阵如图7所示,图中行代表原始类别,列代表预测类别。 所有正确的预测都在对角立方体上。 立方体的颜色越深,它代表的概率就越大。 图 7 显示马赛克被模型 100% 识别。 Alternaria 叶斑病的分类率为 96.36%,这是由于训练数据集中 Alternaria 叶斑病图像的数量相对较少。 此外,由于链格孢叶斑病和灰斑病的几何特征相似,加上暴露条件的复杂性,该模型在区分链格孢属叶斑病和灰斑病时很容易混淆。 |
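The cross-validation summary reported in this section can be checked directly from the five test accuracies. The short sketch below uses only Python's standard `statistics` module. The variable names are illustrative. It suggests that the reported 0.363% corresponds to the sample standard deviation (dividing by n - 1) rather than the population standard deviation.

```python
import statistics

# Test-set accuracies (in percent) of the five cross-validation models, as quoted in the text.
test_accuracies = [98.82, 98.65, 98.15, 98.15, 97.98]

mean_accuracy = statistics.mean(test_accuracies)
sample_std = statistics.stdev(test_accuracies)       # divides by n - 1
population_std = statistics.pstdev(test_accuracies)  # divides by n

print(f"mean = {mean_accuracy:.2f}%")
print(f"sample std = {sample_std:.3f}%, population std = {population_std:.3f}%")
```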
原文 | 译文 |
---|---|
The advantages of transfer learning include reducing the number of images required for training, lowering model training costs, shortening training time, and alleviating over-fitting [44]. We selected 4213 leaf images of 5 plant species (tomato, cucumber, chili, apple, and grape) from the PlantVillage dataset. XDNet was pre-trained on these images, so that a pre-trained model with prior knowledge of crop leaves was established. The shallow layers of the pre-trained network extract general, low-level features, such as plant leaf edges. These features do not change significantly and are suitable for many datasets and tasks [45]. Therefore, the pre-trained model can be transferred to the ATLDs recognition task. Figure 8 compares the classification accuracy and convergence rate of XDNet with and without transfer learning. Acc_1 and loss_1 are the accuracy and loss values of the model on the validation set with transfer learning, and acc_2 and loss_2 are those without transfer learning. Comparative experiments show that, on the testing dataset, the accuracy of the model with transfer learning is 1.35% higher than that of the model without it. Better convergence is also obtained through transfer learning. | 迁移学习的优点在于可以减少训练所需的图像数量、降低模型训练成本、缩短训练时间、缓解过拟合等[44]。我们从 PlantVillage 数据集中选择了 5 种植物(番茄、黄瓜、辣椒、苹果和葡萄)的 4213 张叶子图像。 XDNet 在这些图像上进行了预训练,从而建立了具有作物叶子先验知识的预训练模型。预训练网络的浅层提取一般的低级特征,例如植物叶子边缘。这些特征没有显着变化,适用于许多数据集和任务 [45]。因此,预训练的模型可以迁移到 ATLDs 识别任务中。图 8 是 XDNet 有无迁移学习的分类准确率和收敛速度对比。 Acc_1 和 loss_1 是模型在使用迁移学习的验证集上运行的准确率和损失值,acc_2 和 loss_2 是没有迁移学习的模型。对比实验表明,在测试数据集上,有迁移学习的模型的准确率比没有迁移学习的模型高 1.35%。 通过迁移学习也可以获得更好的收敛性。 |
原文 | 译文 |
---|---|
Data augmentation can help alleviate over-fitting in the training stage of a CNN. So that the model can diagnose diseases from images of varying brightness, sharpness, and contrast collected in practical use, this paper augmented the training dataset of original images. Rotating, flipping, adjusting brightness, contrast, and sharpness, and introducing interference ensure that the model learns as many irrelevant patterns as possible during training [46], thereby avoiding over-fitting and achieving better performance. Figure 9 compares the accuracy and loss of models trained with and without data augmentation of the training dataset after transfer learning. Acc_1 and loss_1 are the accuracy and loss of the model on the validation dataset when the training dataset is augmented; acc_2 and loss_2 are the corresponding values without data augmentation. Comparative experiments show that, on the testing dataset, the accuracy of the model with data augmentation is 6.24% higher than that of the model without it. As Figure 9 shows, data augmentation makes the training process more stable, reduces over-fitting, and improves the generalization of the model. | 数据增强有助于缓解 CNN 训练阶段的过拟合问题。为了使模型在实际使用中能够从亮度、清晰度和对比度各异的图像中诊断病害,本文对原始图像的训练数据集进行了增强。通过旋转、翻转、调整亮度、对比度、锐度以及引入干扰,确保模型在训练过程中尽可能多地学习无关模式 [46],从而避免过拟合并获得更好的性能。图 9 比较了迁移学习之后,训练数据集使用和不使用数据增强时所训练模型的准确率和损失。Acc_1 和 loss_1 是训练数据集经过数据增强时模型在验证数据集上的准确率和损失值;acc_2 和 loss_2 是训练数据集未经数据增强时的对应值。对比实验表明,在测试数据集上,采用数据增强技术的模型准确率比没有采用数据增强技术的模型高6.24%。 从图 9 可以看出,数据增强技术有效地使训练过程更加稳定,减少了过拟合,并使模型更具泛化能力。 |
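The augmentation operations named above (flipping, rotating, adjusting brightness) can be sketched on a tiny grayscale image stored as nested lists. This is an illustrative pure-Python sketch only. The function names, the 8-bit pixel range, and the brightness factors are assumptions, and the paper does not state which library or parameter values were actually used.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clipping to the assumed 8-bit range 0..255."""
    return [[min(255, max(0, round(pixel * factor))) for pixel in row] for row in image]


def augment(image):
    """Return several augmented variants of one training image."""
    return [
        flip_horizontal(image),
        rotate_90_clockwise(image),
        adjust_brightness(image, 1.2),  # brighter (assumed factor)
        adjust_brightness(image, 0.8),  # darker (assumed factor)
    ]


# Hypothetical 2x3 grayscale image.
image = [[10, 20, 250],
         [40, 50, 60]]
variants = augment(image)
```

Each original training image yields four extra samples here. Only the training split would be augmented in this way, as described in the text.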
原文 | 译文 |
---|---|
Figure 10 shows the identification accuracies of XDNet, VGG-INCEP, and five popular CNNs: MobileNet [47], DenseNet-201, VGG-16 [30], Inception-v3, and Xception. These networks are all pre-trained on the subset of the PlantVillage dataset, and the parameters are then transferred to the ATLDs recognition task. Figure 10 plots the classification accuracy and convergence rate of these seven networks on the validation dataset. XDNet achieves higher accuracy and faster convergence than the other models in identifying diseases on the apple leaf dataset. | 图 10 显示了 XDNet、VGG-INCEP 和五个流行的 CNN 的识别精度,包括 MobileNet[47]、DenseNet-201、VGG-16[30]、Inception-v3 和 Xception。 这些网络均由 PlantVillage 数据集的子集进行预训练,然后将参数传递给 ATLDs 识别任务。 图 10 展示了上述 7 个网络(包括 XDNet)在验证数据集上的分类准确率和收敛速度。 XDNet 已被证明在识别苹果叶数据集上的疾病方面比其他模型具有更高的准确度和更快的收敛速度。 |
原文 | 译文 |
---|---|
Figure 10. Classification accuracies of deep convolutional neural networks (DCNNs) and XDNet for the apple tree leaf diseases (ATLDs) task. X axis is the training epoch and Y axis is the classification accuracy of the corresponding network on the validation dataset. | 图 10. 深度卷积神经网络 (DCNN) 和 XDNet 对苹果树叶病害 (ATLD) 任务的分类精度。 X 轴是训练时期,Y 轴是相应网络在验证数据集上的分类精度。 |
Table 4 compares the seven networks in terms of training time, number of network parameters, and best and average cross-validation accuracy on the testing dataset. The VGG-16 model has the shortest calculation time, but its number of training parameters is comparatively large and its accuracy is relatively low. The XDNet model has the highest accuracy. Compared with MobileNet, a lightweight network with the lowest number of parameters and training time, XDNet has slightly more parameters and a slightly longer training time, but achieves higher accuracy. Both VGG-INCEP and XDNet use the model fusion method, but VGG-INCEP has far more parameters and a much longer calculation time than XDNet, and its accuracy is lower. Compared with the DenseNet and Xception models, XDNet not only needs much less training time but also has far fewer model parameters. It can be seen that, compared with the other models, XDNet improves performance without increasing the number of model parameters, while maintaining robustness and efficiency. As shown in Figure 10, the XDNet model converges after 16 epochs and has the best convergence rate of the compared models. In general, the XDNet model uses relatively little calculation time and few parameters to obtain better convergence and achieves the highest ATLD identification accuracy (98.82%) among the seven compared models. | 表 4 比较了 7 个网络在测试数据集上的训练时间、网络参数数量、交叉验证的最佳准确率和平均准确率。结论是VGG-16模型的计算时间最少,但训练参数量比较大,准确率比较低。 XDNet 模型的准确率最高。 与参数数量和训练时间最少的轻量级网络 MobileNet 相比,XDNet 的参数和训练时间略多,因此准确率更高。 VGG-INCEP和XDNet都采用了模型融合的方法,但是VGG-INCEP的参数数量和计算时间比XDNet高很多,准确率比XDNet低。与 DenseNet 和 Xception 模型相比,XDNet 不仅训练时间少得多,而且模型参数量也少得多。可以看出,与其他模型相比,我们设法在不增加模型参数量的情况下提高了 XDNet 模型的性能,同时保持了模型的鲁棒性和效率。如图 10 所示,XDNet 模型在 16 个 epoch 后已经收敛,与其他模型相比,它具有最好的收敛速度。总的来说,XDNet 模型使用相对较少的计算时间和较少的参数来获得更好的收敛性,并在比较的七个模型中实现了最高的 ATLD 识别准确率(98.82%)。 |
原文 | 译文 |
---|---|
The two CNN models with the best accuracies (XDNet and Xception) were further tested on our dataset to investigate the importance of how the training images were captured. This experiment uses the same data augmentation and transfer learning techniques as Section 2.2.1 and Section 4.3.2. The training, validation, and test datasets are divided in the proportion 6:2:2. Two groups of training and validation datasets were created, each containing the same number of images but only field-background or only laboratory-background images. In the test dataset, images with field background and laboratory background each account for 50%. The experimental results are shown in Table 5. On the same test dataset, the accuracies of the models trained on laboratory-background images are about 14% lower than those of the models trained on field-background images. This shows that images captured in the natural growing environment give these DCNN models better accuracy in actual usage scenarios, and demonstrates the importance of images captured under actual cultivation conditions for the identification of ATLDs. | 在我们的数据集上进一步测试了两个精度更高的 CNN 模型(XDNet 和 Xception),以研究训练图像捕获类型的重要性。本实验使用与第 2.2.1 节和第 4.3.2 节相同的数据增强和迁移学习技术。训练、验证和测试数据集按 6:2:2 的比例划分。分为两组训练和验证数据集,每组仅包含相同数量的现场背景或实验室背景图像。在测试数据集中,具有野外背景和实验室背景的图像分别占50%。实验结果如表 5 所示。 结果表明,在相同的测试数据集上,实验室背景图像训练的模型精度低于现场背景图像训练的模型(约 14%)。这些表明在自然生长环境中捕获的图像使这些 DCNN 模型在实际使用场景中具有更好的准确性,并证明了在实际栽培条件下捕获的图像对 ATLD 识别的重要性。 |
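The 6:2:2 split with a background-balanced test set can be sketched as follows. This is a hypothetical illustration using only the standard library. The file names, the image counts, the seeds, and the helper `split_6_2_2` are all assumptions, since the paper does not describe its splitting code.

```python
import random


def split_6_2_2(items, seed=0):
    """Shuffle items and split them into train/validation/test at a 6:2:2 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names: 50 field-background and 50 laboratory-background images.
field_images = [f"field_{i:03d}.jpg" for i in range(50)]
lab_images = [f"lab_{i:03d}.jpg" for i in range(50)]

# Each group keeps its own training and validation sets.
field_train, field_val, field_test = split_6_2_2(field_images, seed=1)
lab_train, lab_val, lab_test = split_6_2_2(lab_images, seed=2)

# Shared test set in which each background type accounts for 50%.
test_set = field_test + lab_test
```

A model trained on `field_train` and one trained on `lab_train` would then both be evaluated on the same `test_set`, which is what makes the two accuracies comparable.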
原文 | 译文 |
---|---|
Feature visualization helps in understanding ATLDs and eases the debugging of the learning model [37]. Figure 11 visualizes the convolution kernels of different layers. First, as shown in Figure 11b, the shallow convolution layer yields a large number of features. The shallow feature data are very close to the original image data and resemble the results of edge detection. At this stage, the convolution kernels retain most of the image information, which further supports the use of transfer learning. Second, the XDNet model has a strong response to the lesion area, as shown in Figure 11b–d. Deeper in the network, fewer descriptive features are obtained; instead, the features become more abstract, and more information about the disease category becomes implicitly available [29]. This also shows that dense blocks have good feature reuse ability in the deeper layers of the network, and that using dense blocks helps to improve the ATLD identification ability of XDNet. Moreover, Figure 11e visualizes the shallow-layer feature maps of diseased images of the other four diseases, and shows that the XDNet model also has a strong response to the lesion area in these cases. | 特征可视化可以更好地帮助理解 ATLD 并简化学习模型的调试过程 [37]。图11是不同层的卷积核(卷积层)的可视化。首先,如图11b所示,可以发现浅卷积层得到的特征数量很大。浅层特征数据与原始图像数据非常接近,类似于边缘检测的结果。在这个阶段,卷积核保留了大部分图像信息,这进一步验证了迁移学习使用的正确性。其次,XDNet 模型对病变区域有很强的响应,如图 11b-d 所示。当我们在网络中深入时,获得的描述性特征更少,相反,特征变得更加抽象,关于疾病类别的更多信息变得隐式可用 [29]。这也证明了密集块在网络的更深层具有良好的特征重用能力,使用密集块有助于提高 XDNet 的 ATLDs 识别能力。此外,图 11e 是其他四种疾病的病变图像的浅层特征图可视化,它表明 XDNet 模型对病变区域有很强的响应。 |
原文 | 译文 |
---|---|
Applying artificial intelligence to identify ATLDs can help address the mismatch between the demand for professional ATLD identification and the scarcity of expert resources. | 应用人工智能进行ATLD识别,有助于为解决专业ATLD识别需求不对称和专家资源稀缺问题提供思路。 |
Combining the advantages of Xception and DenseNet models, this paper proposes a deep learning network model XDNet with depthwise separable convolutions and densely connected structures for ATLDs recognition. The model can accurately classify five common ATLDs and healthy leaves in the variable shooting conditions, such as different image resolutions, changing lights, contrasts and orientations. The ATLDs dataset we collected contains 2970 images of five common diseases and healthy apple leaves with both laboratory background and complex natural field background. Data augmentation technology and image channel normalization were used to preprocess the dataset, thereby reducing overfitting and enhancing the robustness of the model. | 本文结合 Xception 和 DenseNet 模型的优点,提出了一种深度学习网络模型 XDNet,具有深度可分离卷积和密集连接结构,用于 ATLDs 识别。 该模型可以在不同的图像分辨率、变化的光线、对比度和方向等可变的拍摄条件下准确地分类五种常见的ATLD和健康的叶子。 我们收集的 ATLDs 数据集包含 2970 张具有实验室背景和复杂自然野外背景的五种常见病害和健康苹果叶的图像。 使用数据增强技术和图像通道归一化对数据集进行预处理,从而减少过拟合并增强模型的鲁棒性。 |
The experiment compared XDNet with Inception-v3, MobileNet, VGG-16, DenseNet-201, Xception, and VGG-INCEP. Among them, XDNet has the highest average accuracy, which is 0.58% higher than that of Xception (the second highest average accuracy) after cross-validation five times on our ATLDs dataset. Moreover, XDNet has the best convergence and relatively few parameters. The experimental results show that the deep convolutional neural network is promising for the classification of leaf diseases. | 该实验将 XDNet 与 Inception-v3、MobileNet、VGG-16、DenseNet-201、Xception 和 VGG-INCEP 进行了比较。 其中,XDNet 的平均准确率最高,在我们的 ATLDs 数据集上进行五次交叉验证后,比 Xception(平均准确率第二高)高 0.58%。 而且,XDNet 具有最好的收敛性和相对较少的参数。 实验结果表明,深度卷积神经网络在叶片病害分类方面具有广阔的应用前景。 |
Model fusion is shown in this paper to achieve better results, and it is a promising direction with great potential for future work. Applying data augmentation techniques and transfer learning can improve model performance and yield higher recognition accuracy. Images captured under actual cultivation conditions are important for training models. Therefore, more diverse data can be collected in the future, especially from the natural cultivation environment with different light conditions, complex backgrounds, etc., to further improve model performance. Since XDNet has a small number of model parameters, the trained model can be integrated into mobile applications to provide farmers with expert-level disease diagnostic services. Such applications can also be used for dynamic monitoring of leaf diseases in the orchard, enabling automatic early disease warning and intelligent pesticide prescription. Besides supporting mobile devices by keeping the model lightweight, we also plan to work on the automatic evaluation of disease severity to provide accurate identification and diagnosis of ATLDs on mobile devices. | 模型融合在论文中被证明可以取得更好的结果,这也是一个很有前途的方向,未来的工作潜力很大。应用数据增强技术和迁移学习可以提高模型性能并获得更高的识别准确率。在实际培养条件下捕获的图像对于训练模型很重要。因此,未来可以收集更多样化的数据,特别是来自不同光照条件、复杂背景等的自然栽培环境,以进一步提高模型性能。由于 XDNet 的模型参数数量少,训练好的模型可以集成到移动应用程序中,为农民提供专家级的疾病诊断服务。该应用还可用于果园叶病动态监测,实现病害自动预警和农药智能配药。除了保持模型轻量化的移动支持外,我们还计划致力于疾病严重程度的自动评估,为移动设备上的 ATLD 提供准确的识别和诊断。 |