Journal of Integrative Agriculture, 2024, Vol. 23, Issue 3: 901-922. DOI: 10.1016/j.jia.2023.06.023



Improved multi-scale inverse bottleneck residual network based on triplet parallel attention for apple leaf disease identification

Lei Tang¹, Jizheng Yi¹,²#, Xiaoyao Li¹

  ¹ College of Computer & Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China

  ² Yuelushan Laboratory Carbon Sinks Forests Variety Innovation Center, Changsha 410000, China

  • Received: 2023-01-06 Accepted: 2023-04-12 Online: 2024-03-20 Published: 2024-03-02
  • About author: #Correspondence: Jizheng Yi, E-mail: kingkong148@163.com
  • Supported by:
    This work was supported in part by the General Program of the Hunan Provincial Natural Science Foundation of 2022, China (2022JJ31022), the Undergraduate Education Reform Project of Hunan Province, China (HNJG-2021-0532), and the National Natural Science Foundation of China (62276276).


Abstract: Accurate diagnosis of apple leaf diseases is crucial for improving the quality of apple production and promoting the development of the apple industry. However, different apple leaf diseases do not differ significantly in image texture and structural information, and disease features are difficult to extract against complex backgrounds, which has slowed research progress. To address these problems, this paper proposes an improved multi-scale inverse bottleneck residual network model based on a triplet parallel attention mechanism. The model is built upon ResNet-50 and improves and combines the Inception module and ResNeXt inverse bottleneck blocks to recognize seven classes of apple leaves: six diseases (alternaria leaf spot, brown spot, grey spot, mosaic, rust, and scab) and one healthy class. First, the 3×3 convolutions in some of the residual modules are replaced by multi-scale residual convolutions. Each branch of the multi-scale convolution contains convolution kernels of a different size, which are applied to extract feature maps of different sizes, and the branch outputs are fused by summation to enrich the output features of the images. Second, a global layer-wise dynamic coordinated inverse bottleneck structure is used to reduce the network feature loss; the inverse bottleneck structure reduces the information lost when image features are transformed between feature spaces of different dimensions. The fusion of the multi-scale module and the layer-wise dynamic coordinated inverse bottleneck lets the model effectively balance computational efficiency and feature representation capability, and makes it more robust, through a combination of horizontal and vertical features, in the fine-grained identification of apple leaf diseases. Finally, after each improved module, a triplet parallel attention module is integrated, which realizes cross-dimensional interactions among channels through rotations and residual transformations. This improves dimensional dependencies while raising the parallel search efficiency for important features and the recognition rate of the network at a relatively small computational cost. To verify the validity of the model, we uniformly enhance apple leaf disease images screened from the public datasets of PlantVillage, Baidu PaddlePaddle, and the Internet. The final processed image count is 14,000. An ablation study, a pre-processing comparison, and a method comparison are conducted on the processed dataset. The experimental results demonstrate that the proposed method reaches 98.73% accuracy on the adopted dataset, which is 1.82% higher than the classical ResNet-50 model and 0.29% higher than the result obtained on the apple leaf disease dataset before preprocessing. It also achieves competitive results in apple leaf disease identification compared with some state-of-the-art methods.
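
The multi-scale fusion step described in the abstract (parallel branches with different kernel sizes whose outputs are summed) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `conv2d_same`, `box_kernel`, and `multi_scale_sum`, the kernel sizes 1, 3, and 5, the single-channel input, and the use of fixed averaging kernels in place of learned weights are all assumptions made for illustration. The sketch also assumes that every branch preserves the spatial size through zero padding, so that the branch outputs can be added element-wise.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Cross-correlate a single-channel image with an odd-sized square kernel.

    Zero padding is applied implicitly, so the output has the input's size.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # outside pixels count as 0
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def box_kernel(size: int) -> Matrix:
    """Averaging kernel standing in for learned weights (illustrative only)."""
    value = 1.0 / (size * size)
    return [[value] * size for _ in range(size)]


def multi_scale_sum(image: Matrix, kernel_sizes: Sequence[int] = (1, 3, 5)) -> Matrix:
    """Run one branch per kernel size and fuse the outputs by summation."""
    h, w = len(image), len(image[0])
    fused = [[0.0] * w for _ in range(h)]
    for size in kernel_sizes:
        branch = conv2d_same(image, box_kernel(size))
        for i in range(h):
            for j in range(w):
                fused[i][j] += branch[i][j]
    return fused


# Toy input: a 5x5 map with one bright pixel, standing in for a small lesion.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 9.0
fused = multi_scale_sum(image)
for row in fused:
    print([round(v, 2) for v in row])
```

In the printed map, the centre value combines the sharp response of the 1×1 branch with the broader responses of the 3×3 and 5×5 branches, while the corners are reached only by the 5×5 branch. Summation keeps the channel count unchanged, whereas concatenation, as in the original Inception design, would increase it.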

Key words: multi-scale module, inverse bottleneck structure, triplet parallel attention, apple leaf disease