{"title":"嘉宾评论:可持续农业图像和视频处理的新领域","authors":"Davide Moroni, Dimitrios Kosmopoulos","doi":"10.1049/ipr2.70032","DOIUrl":null,"url":null,"abstract":"<p>The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, being a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore the innovative applications of advanced image processing methodologies in agriculture to enable sustainable production. The focus is not only on addressing agricultural challenges but also on unraveling new research trajectories in image processing that could ripple into other sectors like remote sensing, robotics and photogrammetry. The current special issue is aligned with the Sustainable Development Goals outlined in the 2030 agenda for sustainable development. Conversely, the agricultural domain provides a fertile ground for research challenges that motivate the exploration of new avenues.</p><p>In this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, along with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as assessing the quality and maturity of crops. Another paper focuses on advanced transformer-like methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. An additional paper is dedicated to livestock monitoring, offering quantitative methods for evaluating their impact on climate change.</p><p>More in detail, Sun and Huo present a robust maize leaf disease identification model aimed at delivering precise recognition of images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that this proposed model outperforms several popular architectures. Additionally, the authors provide an analysis of activation maps, which qualitatively demonstrates that the newly implemented model effectively minimizes background interference—a common issue in field-captured images due to factors such as lighting conditions, shooting angles, noise and intricate backgrounds. These challenges generally complicate the extraction of key feature information during model training.</p><p>Kiratiratanapruk et al. 
explore a computer vision-based approach to rice disease diagnosis, addressing real-world challenges encountered in rice field imagery. Their study considers critical factors such as environmental conditions and the varying sizes of rice leaves, aspects that are often under-represented in academic research. The proposed method integrates convolutional neural network (CNN) object detection with an image tiling strategy, where the division of images is guided by an automatically estimated rice leaf width. To achieve this, a specialized CNN model is developed using an 18-layer ResNet architecture for leaf width estimation. This model is trained on a newly generated set of tiled sub-images, ensuring that uniformly sized objects are used in training the rice disease prediction model. The study was validated on a dataset of 4960 images, covering eight distinct rice leaf diseases. Notably, the approach proves effective in managing variations in object size, significantly improving the accuracy of disease detection in practical field conditions.</p><p>Cho et al. explore a data augmentation technique for plant disease classification and early diagnosis, leveraging a generative adversarial network (GAN) to partially mitigate the challenges posed by limited dataset availability in agricultural computer vision. In deep learning–based classification models, data imbalance is a key issue that often hampers performance. To address this, the authors utilize tomato disease images from the publicly available PlantVillage dataset to assess the effectiveness of the GauGAN (Gaussian-GAN) algorithm. Their study highlights the impact of synthetic image generation in enhancing model training. The proposed GauGAN model is employed to generate additional training data for a MobileNet-based classification model. Its performance is then compared against models trained using traditional data augmentation techniques and the cut-mix and mix-up algorithms. Experimental findings, based on F1-scores, indicate that the GauGAN-based augmentation strategy outperforms conventional methods by over 10%, demonstrating its effectiveness in improving classification accuracy.</p><p>Huo et al. propose an enhanced multi-scale YOLOv8 model for detecting and recognizing dense lesions on apple leaves. The detection of apple leaf lesions presents significant challenges due to the wide range of species, diverse morphologies, varying lesion sizes and complex backgrounds. To address these challenges, the proposed YOLOv8 incorporates an improved C2f-RFEM module in the backbone network, enhancing the extraction of disease-related features. Additionally, a newly designed neck network integrates the C2f-DCN and C2f-DCN-EMA modules, which leverage deformable convolutions and an advanced attention mechanism to refine feature representation. To further enhance the detection of small-scale lesions, the model introduces a high-resolution detection head, improving recognition capabilities across multiple scales. The effectiveness of the improved YOLOv8 is validated using the COCO dataset, which includes 80 object categories, and an apple leaf disease dataset comprising eight disease types. Experimental results show that the enhanced model achieves superior performance in terms of mean average precision (mAP) and floating point operations (FLOPs) compared to baseline YOLOv8, Faster R-CNN, RetinaNet, SSD and YOLOv5s. In terms of parameter count and model size, the improved YOLOv8 gives competitive performance.</p><p>Du et al. 
introduce an improved variant of the YOLO family, extending its capabilities with respect to previous papers mentioned above beyond disease detection to include pest identification. Building on YOLOv7, the proposed model incorporates a progressive spatial adaptive feature pyramid (PSAFP) to enhance multi-scale feature representation. Additionally, the authors employ a combination of varifocal loss and rank-based mining loss to refine the object loss calculation, effectively reducing the impact of irrelevant negative samples during training. Evaluations conducted on the filtered PlantVillage dataset and the rice-corn pest dataset demonstrate the effectiveness of YOLOv7-PSAFP, which outperforms the baseline YOLOv7 model.</p><p>Building on a YOLO-based architecture, Ling et al. explore adaptive object detection with an enhanced model, shifting the focus from threat quantification, as addressed in the works introduced above, to maturity detection in raspberry cultivation. To tackle this challenge, the authors propose HSA-YOLOv5 (HSV Self-Adaptive YOLOv5), a method designed to identify raspberries at different ripeness stages—immature, nearly ripe and fully ripe. The approach involves converting images from the standard RGB colour space to an optimized HSV colour space. The method improves data representation by fine-tuning parameters and enhancing contrast among visually similar hues while preserving key image features. Additionally, adaptive HSV parameter selection is applied based on varying weather conditions, ensuring consistent preprocessing across the dataset. The improved model is evaluated against the standard YOLOv5 using a custom-built dataset. Experimental results indicate that the enhanced approach achieves an mAP of 0.97, marking a 6.42 percentage point improvement over the baseline YOLOv5 model.</p><p>Rossi et al. explore advanced image processing techniques, leveraging a combination of Swin Transformer and mixture-of-experts models to enhance image quality in remote sensing. Their approach has potential applications in agriculture, particularly for improving the accuracy and usability of multi-spectral imagery. The authors introduce Swin2-MoSE, a novel single-image super-resolution model specifically designed for remote sensing tasks. A key innovation of this model is a newly designed layer that effectively merges outputs from individual experts, alongside the implementation of a per-example strategy—an improvement over the more commonly used per-token approach. Additionally, they investigate the interaction between positional encodings, demonstrating how per-channel and per-head biases can complement each other to improve performance. To further enhance model effectiveness, the study incorporates a loss function that combines normalized cross-correlation (NCC) and structural similarity index measure (SSIM), addressing the known limitations of mean squared error (MSE) loss. Experimental evaluations reveal that Swin2-MoSE outperforms existing Swin-based models, on the Sen2Venµs and OLI2MSI datasets.</p><p>Finally, Embaby et al. explore the application of imaging technologies in livestock management and sustainability by introducing an optical gas imaging (OGI) and deep learning framework for quantifying enteric methane emissions from rumen fermentation in vitro. To achieve this, the authors propose a novel architecture called Gasformer, designed to enhance the detection and measurement of methane emissions. 
The model demonstrates competitive performance, which highlights the potential of OGI technology, when combined with advanced semantic segmentation models, to accurately predict and quantify methane emissions in livestock farming. This approach could contribute to the development of effective mitigation strategies aimed at reducing the environmental impact of methane emissions and addressing climate change.</p><p>We hope that the papers presented in this special issue reflect the significant progress made by modern methods in advancing sustainable agriculture. Thanks to recent advances in image and video processing, these techniques are now widely disseminated and capable of supporting innovative farming strategies worldwide. These developments have the potential to contribute to the Sustainable Development Goals outlined in the 2023 Agenda for Sustainable Development.</p><p><b>Davide Moroni</b>: conceptualization, writing – original draft, writing – review & editing. <b>Dimitrios Kosmopoulos</b>: conceptualization, writing – original draft, writing – review & editing.</p><p>The authors declare no conflicts of interest.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70032","citationCount":"0","resultStr":"{\"title\":\"Guest Editorial: New Frontiers in Image and Video Processing for Sustainable Agriculture\",\"authors\":\"Davide Moroni, Dimitrios Kosmopoulos\",\"doi\":\"10.1049/ipr2.70032\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, being a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore the innovative applications of advanced image processing methodologies in agriculture to enable sustainable production. The focus is not only on addressing agricultural challenges but also on unraveling new research trajectories in image processing that could ripple into other sectors like remote sensing, robotics and photogrammetry. The current special issue is aligned with the Sustainable Development Goals outlined in the 2030 agenda for sustainable development. Conversely, the agricultural domain provides a fertile ground for research challenges that motivate the exploration of new avenues.</p><p>In this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, along with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as assessing the quality and maturity of crops. 
Another paper focuses on advanced transformer-like methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. An additional paper is dedicated to livestock monitoring, offering quantitative methods for evaluating their impact on climate change.</p><p>More in detail, Sun and Huo present a robust maize leaf disease identification model aimed at delivering precise recognition of images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that this proposed model outperforms several popular architectures. Additionally, the authors provide an analysis of activation maps, which qualitatively demonstrates that the newly implemented model effectively minimizes background interference—a common issue in field-captured images due to factors such as lighting conditions, shooting angles, noise and intricate backgrounds. These challenges generally complicate the extraction of key feature information during model training.</p><p>Kiratiratanapruk et al. explore a computer vision-based approach to rice disease diagnosis, addressing real-world challenges encountered in rice field imagery. Their study considers critical factors such as environmental conditions and the varying sizes of rice leaves, aspects that are often under-represented in academic research. The proposed method integrates convolutional neural network (CNN) object detection with an image tiling strategy, where the division of images is guided by an automatically estimated rice leaf width. To achieve this, a specialized CNN model is developed using an 18-layer ResNet architecture for leaf width estimation. This model is trained on a newly generated set of tiled sub-images, ensuring that uniformly sized objects are used in training the rice disease prediction model. The study was validated on a dataset of 4960 images, covering eight distinct rice leaf diseases. Notably, the approach proves effective in managing variations in object size, significantly improving the accuracy of disease detection in practical field conditions.</p><p>Cho et al. explore a data augmentation technique for plant disease classification and early diagnosis, leveraging a generative adversarial network (GAN) to partially mitigate the challenges posed by limited dataset availability in agricultural computer vision. In deep learning–based classification models, data imbalance is a key issue that often hampers performance. To address this, the authors utilize tomato disease images from the publicly available PlantVillage dataset to assess the effectiveness of the GauGAN (Gaussian-GAN) algorithm. Their study highlights the impact of synthetic image generation in enhancing model training. The proposed GauGAN model is employed to generate additional training data for a MobileNet-based classification model. 
Its performance is then compared against models trained using traditional data augmentation techniques and the cut-mix and mix-up algorithms. Experimental findings, based on F1-scores, indicate that the GauGAN-based augmentation strategy outperforms conventional methods by over 10%, demonstrating its effectiveness in improving classification accuracy.</p><p>Huo et al. propose an enhanced multi-scale YOLOv8 model for detecting and recognizing dense lesions on apple leaves. The detection of apple leaf lesions presents significant challenges due to the wide range of species, diverse morphologies, varying lesion sizes and complex backgrounds. To address these challenges, the proposed YOLOv8 incorporates an improved C2f-RFEM module in the backbone network, enhancing the extraction of disease-related features. Additionally, a newly designed neck network integrates the C2f-DCN and C2f-DCN-EMA modules, which leverage deformable convolutions and an advanced attention mechanism to refine feature representation. To further enhance the detection of small-scale lesions, the model introduces a high-resolution detection head, improving recognition capabilities across multiple scales. The effectiveness of the improved YOLOv8 is validated using the COCO dataset, which includes 80 object categories, and an apple leaf disease dataset comprising eight disease types. Experimental results show that the enhanced model achieves superior performance in terms of mean average precision (mAP) and floating point operations (FLOPs) compared to baseline YOLOv8, Faster R-CNN, RetinaNet, SSD and YOLOv5s. In terms of parameter count and model size, the improved YOLOv8 gives competitive performance.</p><p>Du et al. introduce an improved variant of the YOLO family, extending its capabilities with respect to previous papers mentioned above beyond disease detection to include pest identification. Building on YOLOv7, the proposed model incorporates a progressive spatial adaptive feature pyramid (PSAFP) to enhance multi-scale feature representation. Additionally, the authors employ a combination of varifocal loss and rank-based mining loss to refine the object loss calculation, effectively reducing the impact of irrelevant negative samples during training. Evaluations conducted on the filtered PlantVillage dataset and the rice-corn pest dataset demonstrate the effectiveness of YOLOv7-PSAFP, which outperforms the baseline YOLOv7 model.</p><p>Building on a YOLO-based architecture, Ling et al. explore adaptive object detection with an enhanced model, shifting the focus from threat quantification, as addressed in the works introduced above, to maturity detection in raspberry cultivation. To tackle this challenge, the authors propose HSA-YOLOv5 (HSV Self-Adaptive YOLOv5), a method designed to identify raspberries at different ripeness stages—immature, nearly ripe and fully ripe. The approach involves converting images from the standard RGB colour space to an optimized HSV colour space. The method improves data representation by fine-tuning parameters and enhancing contrast among visually similar hues while preserving key image features. Additionally, adaptive HSV parameter selection is applied based on varying weather conditions, ensuring consistent preprocessing across the dataset. The improved model is evaluated against the standard YOLOv5 using a custom-built dataset. 
Experimental results indicate that the enhanced approach achieves an mAP of 0.97, marking a 6.42 percentage point improvement over the baseline YOLOv5 model.</p><p>Rossi et al. explore advanced image processing techniques, leveraging a combination of Swin Transformer and mixture-of-experts models to enhance image quality in remote sensing. Their approach has potential applications in agriculture, particularly for improving the accuracy and usability of multi-spectral imagery. The authors introduce Swin2-MoSE, a novel single-image super-resolution model specifically designed for remote sensing tasks. A key innovation of this model is a newly designed layer that effectively merges outputs from individual experts, alongside the implementation of a per-example strategy—an improvement over the more commonly used per-token approach. Additionally, they investigate the interaction between positional encodings, demonstrating how per-channel and per-head biases can complement each other to improve performance. To further enhance model effectiveness, the study incorporates a loss function that combines normalized cross-correlation (NCC) and structural similarity index measure (SSIM), addressing the known limitations of mean squared error (MSE) loss. Experimental evaluations reveal that Swin2-MoSE outperforms existing Swin-based models, on the Sen2Venµs and OLI2MSI datasets.</p><p>Finally, Embaby et al. explore the application of imaging technologies in livestock management and sustainability by introducing an optical gas imaging (OGI) and deep learning framework for quantifying enteric methane emissions from rumen fermentation in vitro. To achieve this, the authors propose a novel architecture called Gasformer, designed to enhance the detection and measurement of methane emissions. The model demonstrates competitive performance, which highlights the potential of OGI technology, when combined with advanced semantic segmentation models, to accurately predict and quantify methane emissions in livestock farming. This approach could contribute to the development of effective mitigation strategies aimed at reducing the environmental impact of methane emissions and addressing climate change.</p><p>We hope that the papers presented in this special issue reflect the significant progress made by modern methods in advancing sustainable agriculture. Thanks to recent advances in image and video processing, these techniques are now widely disseminated and capable of supporting innovative farming strategies worldwide. These developments have the potential to contribute to the Sustainable Development Goals outlined in the 2023 Agenda for Sustainable Development.</p><p><b>Davide Moroni</b>: conceptualization, writing – original draft, writing – review & editing. 
<b>Dimitrios Kosmopoulos</b>: conceptualization, writing – original draft, writing – review & editing.</p><p>The authors declare no conflicts of interest.</p>\",\"PeriodicalId\":56303,\"journal\":{\"name\":\"IET Image Processing\",\"volume\":\"19 1\",\"pages\":\"\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2025-03-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70032\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IET Image Processing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70032\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Image Processing","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70032","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Guest Editorial: New Frontiers in Image and Video Processing for Sustainable Agriculture
The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore innovative applications of advanced image processing methodologies in agriculture that enable sustainable production. The focus is not only on addressing agricultural challenges but also on opening new research trajectories in image processing that could ripple into other sectors such as remote sensing, robotics and photogrammetry. This special issue is aligned with the Sustainable Development Goals outlined in the 2030 Agenda for Sustainable Development. In turn, the agricultural domain provides fertile ground for research challenges that motivate the exploration of new avenues.
For this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, together with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as in assessing the quality and maturity of crops. Another paper focuses on advanced transformer-based methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. A final paper is dedicated to livestock monitoring, offering quantitative methods for evaluating the impact of livestock on climate change.
In more detail, Sun and Huo present a robust maize leaf disease identification model aimed at precise recognition in images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that the proposed model outperforms several popular architectures. Additionally, the authors provide an analysis of activation maps, which qualitatively demonstrates that the new model effectively minimizes background interference, a common issue in field-captured images caused by factors such as lighting conditions, shooting angles, noise and intricate backgrounds. These challenges generally complicate the extraction of key feature information during model training.
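To make the architectural substitution concrete, the following is a minimal PyTorch sketch of a CBAM block of the kind that can replace the squeeze-and-excitation stage of an MBConv block. The reduction ratio and kernel size are illustrative defaults, not the authors' implementation.

```python
# Minimal CBAM sketch: channel attention followed by spatial attention.
# Hyperparameters (reduction=16, kernel_size=7) are illustrative assumptions.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)     # channel-wise max map
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Drop-in replacement for the squeeze-and-excitation stage of an MBConv block."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```

Unlike squeeze-and-excitation, which reweights channels only, the second stage here produces a per-location weight map, which is what lets the network suppress cluttered backgrounds in field imagery.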
Kiratiratanapruk et al. explore a computer vision-based approach to rice disease diagnosis, addressing real-world challenges encountered in rice field imagery. Their study considers critical factors such as environmental conditions and the varying sizes of rice leaves, aspects that are often under-represented in academic research. The proposed method integrates convolutional neural network (CNN) object detection with an image tiling strategy, where the division of images is guided by an automatically estimated rice leaf width. To achieve this, a specialized CNN model is developed using an 18-layer ResNet architecture for leaf width estimation. This model is trained on a newly generated set of tiled sub-images, ensuring that uniformly sized objects are used in training the rice disease prediction model. The study was validated on a dataset of 4960 images, covering eight distinct rice leaf diseases. Notably, the approach proves effective in managing variations in object size, significantly improving the accuracy of disease detection in practical field conditions.
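The tiling idea can be sketched in a few lines: the tile size is chosen as a fixed multiple of the leaf width predicted by the regression CNN, so that objects in the resulting sub-images have roughly uniform scale. The `width_multiple` factor and the minimum tile size below are illustrative assumptions, not the paper's values.

```python
# Minimal sketch of leaf-width-guided image tiling (NumPy).
# width_multiple and the 64-pixel floor are illustrative assumptions.
import numpy as np

def tile_by_leaf_width(image: np.ndarray, leaf_width_px: float,
                       width_multiple: float = 8.0) -> list[np.ndarray]:
    """Split an H x W x 3 image into square tiles scaled to the estimated leaf width."""
    tile = max(64, int(round(leaf_width_px * width_multiple)))
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            tiles.append(image[y:y + tile, x:x + tile])  # edge tiles may be smaller
    return tiles
```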
Cho et al. explore a data augmentation technique for plant disease classification and early diagnosis, leveraging a generative adversarial network (GAN) to partially mitigate the challenges posed by limited dataset availability in agricultural computer vision. In deep learning–based classification models, data imbalance is a key issue that often hampers performance. To address this, the authors utilize tomato disease images from the publicly available PlantVillage dataset to assess the effectiveness of the GauGAN (Gaussian-GAN) algorithm. Their study highlights the impact of synthetic image generation in enhancing model training. The proposed GauGAN model is employed to generate additional training data for a MobileNet-based classification model. Its performance is then compared against models trained using traditional data augmentation techniques and the CutMix and MixUp algorithms. Experimental findings, based on F1-scores, indicate that the GauGAN-based augmentation strategy outperforms conventional methods by over 10%, demonstrating its effectiveness in improving classification accuracy.
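As a point of reference for that comparison, the MixUp baseline can be sketched in a few lines: image pairs and their one-hot labels are combined convexly, with the mixing coefficient drawn from a Beta distribution. The value of alpha below is a common default, assumed for illustration.

```python
# Minimal MixUp sketch (PyTorch); alpha=0.2 is an illustrative default.
import torch

def mixup_batch(images: torch.Tensor, labels: torch.Tensor, alpha: float = 0.2):
    """images: (B, C, H, W); labels: one-hot floats of shape (B, num_classes)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_images, mixed_labels
```

GAN-based augmentation differs in kind: rather than interpolating existing samples, it synthesizes entirely new disease images, which is why it can help more when the underlying class is genuinely under-represented.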
Huo et al. propose an enhanced multi-scale YOLOv8 model for detecting and recognizing dense lesions on apple leaves. The detection of apple leaf lesions presents significant challenges due to the wide range of species, diverse morphologies, varying lesion sizes and complex backgrounds. To address these challenges, the proposed YOLOv8 incorporates an improved C2f-RFEM module in the backbone network, enhancing the extraction of disease-related features. Additionally, a newly designed neck network integrates the C2f-DCN and C2f-DCN-EMA modules, which leverage deformable convolutions and an advanced attention mechanism to refine feature representation. To further enhance the detection of small-scale lesions, the model introduces a high-resolution detection head, improving recognition capabilities across multiple scales. The effectiveness of the improved YOLOv8 is validated using the COCO dataset, which includes 80 object categories, and an apple leaf disease dataset comprising eight disease types. Experimental results show that the enhanced model achieves superior performance in terms of mean average precision (mAP) and floating point operations (FLOPs) compared to baseline YOLOv8, Faster R-CNN, RetinaNet, SSD and YOLOv5s. In terms of parameter count and model size, the improved YOLOv8 gives competitive performance.
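The deformable convolutions underlying the C2f-DCN modules can be illustrated with a small sketch: a parallel convolution predicts per-pixel sampling offsets, and torchvision's DeformConv2d samples the input feature map at those shifted locations. The block structure and channel sizes below are assumptions for illustration, not the paper's design.

```python
# Minimal deformable-convolution sketch using torchvision.ops.DeformConv2d.
# The offset-prediction layer and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # two offsets (dy, dx) per kernel element
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.dconv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        return self.dconv(x, self.offset(x))

feats = torch.randn(1, 64, 80, 80)
print(DeformableBlock(64, 128)(feats).shape)  # torch.Size([1, 128, 80, 80])
```

Because the sampling grid bends to follow the predicted offsets, such layers adapt to the irregular shapes and varying sizes of leaf lesions better than a rigid square kernel.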
Du et al. introduce an improved variant of the YOLO family, extending the scope of the YOLO-based approaches discussed above beyond disease detection to pest identification. Building on YOLOv7, the proposed model incorporates a progressive spatial adaptive feature pyramid (PSAFP) to enhance multi-scale feature representation. Additionally, the authors employ a combination of varifocal loss and rank-based mining loss to refine the object loss calculation, effectively reducing the impact of irrelevant negative samples during training. Evaluations conducted on the filtered PlantVillage dataset and the rice-corn pest dataset demonstrate the effectiveness of YOLOv7-PSAFP, which outperforms the baseline YOLOv7 model.
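For readers unfamiliar with varifocal loss, a minimal sketch follows: positives are weighted by their IoU-aware target score, while negatives are down-weighted by a focal factor on the predicted confidence, which is exactly what suppresses the influence of easy, irrelevant negatives. The hyperparameters follow commonly used defaults (alpha = 0.75, gamma = 2.0) and are not necessarily the paper's settings.

```python
# Minimal varifocal loss sketch (PyTorch). target_score is the IoU-aware
# classification target (0 for negatives). alpha/gamma are common defaults.
import torch

def varifocal_loss(pred_logits: torch.Tensor, target_score: torch.Tensor,
                   alpha: float = 0.75, gamma: float = 2.0) -> torch.Tensor:
    p = pred_logits.sigmoid()
    positive = (target_score > 0).float()
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        pred_logits, target_score, reduction="none")
    # positives weighted by their target score; negatives by a focal factor
    weight = positive * target_score + (1.0 - positive) * alpha * p.pow(gamma)
    return (weight * bce).mean()
```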
Building on a YOLO-based architecture, Ling et al. explore adaptive object detection with an enhanced model, shifting the focus from threat quantification, as addressed in the works introduced above, to maturity detection in raspberry cultivation. To tackle this challenge, the authors propose HSA-YOLOv5 (HSV Self-Adaptive YOLOv5), a method designed to identify raspberries at different ripeness stages—immature, nearly ripe and fully ripe. The approach involves converting images from the standard RGB colour space to an optimized HSV colour space. The method improves data representation by fine-tuning parameters and enhancing contrast among visually similar hues while preserving key image features. Additionally, adaptive HSV parameter selection is applied based on varying weather conditions, ensuring consistent preprocessing across the dataset. The improved model is evaluated against the standard YOLOv5 using a custom-built dataset. Experimental results indicate that the enhanced approach achieves an mAP of 0.97, marking a 6.42 percentage point improvement over the baseline YOLOv5 model.
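The preprocessing idea can be sketched as follows: convert to HSV and adapt the saturation and value gains to the scene brightness, so that dull images taken under overcast conditions receive a stronger boost. The brightness threshold and gain schedule below are assumptions for illustration; the paper derives its parameters adaptively from weather conditions.

```python
# Minimal adaptive HSV preprocessing sketch (OpenCV).
# The brightness threshold and gains are illustrative assumptions.
import cv2
import numpy as np

def adaptive_hsv(image_bgr: np.ndarray) -> np.ndarray:
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    brightness = hsv[..., 2].mean() / 255.0
    s_gain = 1.4 if brightness < 0.4 else 1.15   # darker scene -> stronger boost
    v_gain = 1.3 if brightness < 0.4 else 1.05
    hsv[..., 1] = np.clip(hsv[..., 1] * s_gain, 0, 255)   # saturation channel
    hsv[..., 2] = np.clip(hsv[..., 2] * v_gain, 0, 255)   # value channel
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```

Boosting saturation contrast in HSV space is what helps separate the visually similar hues of nearly ripe and fully ripe berries before detection.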
Rossi et al. explore advanced image processing techniques, leveraging a combination of Swin Transformer and mixture-of-experts models to enhance image quality in remote sensing. Their approach has potential applications in agriculture, particularly for improving the accuracy and usability of multi-spectral imagery. The authors introduce Swin2-MoSE, a novel single-image super-resolution model specifically designed for remote sensing tasks. A key innovation of this model is a newly designed layer that effectively merges outputs from individual experts, alongside the implementation of a per-example strategy, an improvement over the more commonly used per-token approach. Additionally, they investigate the interaction between positional encodings, demonstrating how per-channel and per-head biases can complement each other to improve performance. To further enhance model effectiveness, the study incorporates a loss function that combines normalized cross-correlation (NCC) and the structural similarity index measure (SSIM), addressing the known limitations of mean squared error (MSE) loss. Experimental evaluations reveal that Swin2-MoSE outperforms existing Swin-based models on the Sen2Venµs and OLI2MSI datasets.
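In the spirit of that loss design, a combined NCC + SSIM objective can be sketched as follows. Both terms are similarities, so each is turned into a loss as (1 - similarity); the equal weighting and the single-window (non-sliding) SSIM are simplifying assumptions for brevity, not the paper's exact formulation.

```python
# Minimal NCC + SSIM loss sketch (PyTorch). Weights and the global (single-
# window) SSIM are simplifying assumptions; inputs assumed scaled to [0, 1].
import torch

def ncc(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Normalized cross-correlation over spatial dims, averaged over batch/channels."""
    xc = x - x.mean(dim=(-2, -1), keepdim=True)
    yc = y - y.mean(dim=(-2, -1), keepdim=True)
    num = (xc * yc).mean(dim=(-2, -1))
    den = torch.sqrt((xc * xc).mean(dim=(-2, -1)) * (yc * yc).mean(dim=(-2, -1))) + eps
    return (num / den).mean()

def global_ssim(x: torch.Tensor, y: torch.Tensor,
                c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    """SSIM computed over each whole image instead of sliding windows."""
    mx, my = x.mean(dim=(-2, -1)), y.mean(dim=(-2, -1))
    vx = ((x - mx[..., None, None]) ** 2).mean(dim=(-2, -1))
    vy = ((y - my[..., None, None]) ** 2).mean(dim=(-2, -1))
    cov = ((x - mx[..., None, None]) * (y - my[..., None, None])).mean(dim=(-2, -1))
    s = ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
    return s.mean()

def sr_loss(pred: torch.Tensor, target: torch.Tensor,
            w_ncc: float = 0.5, w_ssim: float = 0.5) -> torch.Tensor:
    return w_ncc * (1.0 - ncc(pred, target)) + w_ssim * (1.0 - global_ssim(pred, target))
```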
Finally, Embaby et al. explore the application of imaging technologies in livestock management and sustainability by introducing an optical gas imaging (OGI) and deep learning framework for quantifying enteric methane emissions from rumen fermentation in vitro. To achieve this, the authors propose a novel architecture called Gasformer, designed to enhance the detection and measurement of methane emissions. The model demonstrates competitive performance, which highlights the potential of OGI technology, when combined with advanced semantic segmentation models, to accurately predict and quantify methane emissions in livestock farming. This approach could contribute to the development of effective mitigation strategies aimed at reducing the environmental impact of methane emissions and addressing climate change.
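To give a rough picture of how segmentation output can feed quantification, one plausible reduction is to accumulate the per-frame plume mask area into a time series that is then calibrated against reference flux measurements. The helper below is a hypothetical sketch of that step, not the paper's pipeline; the `px_to_cm2` calibration constant is purely illustrative.

```python
# Hypothetical sketch: reduce per-frame plume segmentation masks to an area
# time series. The px_to_cm2 calibration factor is purely illustrative.
import numpy as np

def plume_area_series(masks: np.ndarray, px_to_cm2: float) -> np.ndarray:
    """masks: (T, H, W) binary plume masks from the segmentation model."""
    return masks.reshape(masks.shape[0], -1).sum(axis=1) * px_to_cm2
```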
We hope that the papers presented in this special issue reflect the significant progress made by modern methods in advancing sustainable agriculture. Thanks to recent advances in image and video processing, these techniques are now widely disseminated and capable of supporting innovative farming strategies worldwide. These developments have the potential to contribute to the Sustainable Development Goals outlined in the 2030 Agenda for Sustainable Development.

Davide Moroni: conceptualization, writing – original draft, writing – review & editing. Dimitrios Kosmopoulos: conceptualization, writing – original draft, writing – review & editing.

The authors declare no conflicts of interest.