{"title":"嘉宾评论:可持续农业图像和视频处理的新领域","authors":"Davide Moroni, Dimitrios Kosmopoulos","doi":"10.1049/ipr2.70032","DOIUrl":null,"url":null,"abstract":"<p>The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, being a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore the innovative applications of advanced image processing methodologies in agriculture to enable sustainable production. The focus is not only on addressing agricultural challenges but also on unraveling new research trajectories in image processing that could ripple into other sectors like remote sensing, robotics and photogrammetry. The current special issue is aligned with the Sustainable Development Goals outlined in the 2030 agenda for sustainable development. Conversely, the agricultural domain provides a fertile ground for research challenges that motivate the exploration of new avenues.</p><p>In this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, along with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as assessing the quality and maturity of crops. Another paper focuses on advanced transformer-like methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. An additional paper is dedicated to livestock monitoring, offering quantitative methods for evaluating their impact on climate change.</p><p>More in detail, Sun and Huo present a robust maize leaf disease identification model aimed at delivering precise recognition of images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that this proposed model outperforms several popular architectures. Additionally, the authors provide an analysis of activation maps, which qualitatively demonstrates that the newly implemented model effectively minimizes background interference—a common issue in field-captured images due to factors such as lighting conditions, shooting angles, noise and intricate backgrounds. These challenges generally complicate the extraction of key feature information during model training.</p><p>Kiratiratanapruk et al. 
explore a computer vision-based approach to rice disease diagnosis, addressing real-world challenges encountered in rice field imagery. Their study considers critical factors such as environmental conditions and the varying sizes of rice leaves, aspects that are often under-represented in academic research. The proposed method integrates convolutional neural network (CNN) object detection with an image tiling strategy, where the division of images is guided by an automatically estimated rice leaf width. To achieve this, a specialized CNN model is developed using an 18-layer ResNet architecture for leaf width estimation. This model is trained on a newly generated set of tiled sub-images, ensuring that uniformly sized objects are used in training the rice disease prediction model. The study was validated on a dataset of 4960 images, covering eight distinct rice leaf diseases. Notably, the approach proves effective in managing variations in object size, significantly improving the accuracy of disease detection in practical field conditions.</p><p>Cho et al. explore a data augmentation technique for plant disease classification and early diagnosis, leveraging a generative adversarial network (GAN) to partially mitigate the challenges posed by limited dataset availability in agricultural computer vision. In deep learning–based classification models, data imbalance is a key issue that often hampers performance. To address this, the authors utilize tomato disease images from the publicly available PlantVillage dataset to assess the effectiveness of the GauGAN (Gaussian-GAN) algorithm. Their study highlights the impact of synthetic image generation in enhancing model training. The proposed GauGAN model is employed to generate additional training data for a MobileNet-based classification model. Its performance is then compared against models trained using traditional data augmentation techniques and the cut-mix and mix-up algorithms. Experimental findings, based on F1-scores, indicate that the GauGAN-based augmentation strategy outperforms conventional methods by over 10%, demonstrating its effectiveness in improving classification accuracy.</p><p>Huo et al. propose an enhanced multi-scale YOLOv8 model for detecting and recognizing dense lesions on apple leaves. The detection of apple leaf lesions presents significant challenges due to the wide range of species, diverse morphologies, varying lesion sizes and complex backgrounds. To address these challenges, the proposed YOLOv8 incorporates an improved C2f-RFEM module in the backbone network, enhancing the extraction of disease-related features. Additionally, a newly designed neck network integrates the C2f-DCN and C2f-DCN-EMA modules, which leverage deformable convolutions and an advanced attention mechanism to refine feature representation. To further enhance the detection of small-scale lesions, the model introduces a high-resolution detection head, improving recognition capabilities across multiple scales. The effectiveness of the improved YOLOv8 is validated using the COCO dataset, which includes 80 object categories, and an apple leaf disease dataset comprising eight disease types. Experimental results show that the enhanced model achieves superior performance in terms of mean average precision (mAP) and floating point operations (FLOPs) compared to baseline YOLOv8, Faster R-CNN, RetinaNet, SSD and YOLOv5s. In terms of parameter count and model size, the improved YOLOv8 gives competitive performance.</p><p>Du et al. 
introduce an improved variant of the YOLO family, extending its capabilities with respect to previous papers mentioned above beyond disease detection to include pest identification. Building on YOLOv7, the proposed model incorporates a progressive spatial adaptive feature pyramid (PSAFP) to enhance multi-scale feature representation. Additionally, the authors employ a combination of varifocal loss and rank-based mining loss to refine the object loss calculation, effectively reducing the impact of irrelevant negative samples during training. Evaluations conducted on the filtered PlantVillage dataset and the rice-corn pest dataset demonstrate the effectiveness of YOLOv7-PSAFP, which outperforms the baseline YOLOv7 model.</p><p>Building on a YOLO-based architecture, Ling et al. explore adaptive object detection with an enhanced model, shifting the focus from threat quantification, as addressed in the works introduced above, to maturity detection in raspberry cultivation. To tackle this challenge, the authors propose HSA-YOLOv5 (HSV Self-Adaptive YOLOv5), a method designed to identify raspberries at different ripeness stages—immature, nearly ripe and fully ripe. The approach involves converting images from the standard RGB colour space to an optimized HSV colour space. The method improves data representation by fine-tuning parameters and enhancing contrast among visually similar hues while preserving key image features. Additionally, adaptive HSV parameter selection is applied based on varying weather conditions, ensuring consistent preprocessing across the dataset. The improved model is evaluated against the standard YOLOv5 using a custom-built dataset. Experimental results indicate that the enhanced approach achieves an mAP of 0.97, marking a 6.42 percentage point improvement over the baseline YOLOv5 model.</p><p>Rossi et al. explore advanced image processing techniques, leveraging a combination of Swin Transformer and mixture-of-experts models to enhance image quality in remote sensing. Their approach has potential applications in agriculture, particularly for improving the accuracy and usability of multi-spectral imagery. The authors introduce Swin2-MoSE, a novel single-image super-resolution model specifically designed for remote sensing tasks. A key innovation of this model is a newly designed layer that effectively merges outputs from individual experts, alongside the implementation of a per-example strategy—an improvement over the more commonly used per-token approach. Additionally, they investigate the interaction between positional encodings, demonstrating how per-channel and per-head biases can complement each other to improve performance. To further enhance model effectiveness, the study incorporates a loss function that combines normalized cross-correlation (NCC) and structural similarity index measure (SSIM), addressing the known limitations of mean squared error (MSE) loss. Experimental evaluations reveal that Swin2-MoSE outperforms existing Swin-based models, on the Sen2Venµs and OLI2MSI datasets.</p><p>Finally, Embaby et al. explore the application of imaging technologies in livestock management and sustainability by introducing an optical gas imaging (OGI) and deep learning framework for quantifying enteric methane emissions from rumen fermentation in vitro. To achieve this, the authors propose a novel architecture called Gasformer, designed to enhance the detection and measurement of methane emissions. 
The model demonstrates competitive performance, which highlights the potential of OGI technology, when combined with advanced semantic segmentation models, to accurately predict and quantify methane emissions in livestock farming. This approach could contribute to the development of effective mitigation strategies aimed at reducing the environmental impact of methane emissions and addressing climate change.</p><p>We hope that the papers presented in this special issue reflect the significant progress made by modern methods in advancing sustainable agriculture. Thanks to recent advances in image and video processing, these techniques are now widely disseminated and capable of supporting innovative farming strategies worldwide. These developments have the potential to contribute to the Sustainable Development Goals outlined in the 2023 Agenda for Sustainable Development.</p><p><b>Davide Moroni</b>: conceptualization, writing – original draft, writing – review & editing. <b>Dimitrios Kosmopoulos</b>: conceptualization, writing – original draft, writing – review & editing.</p><p>The authors declare no conflicts of interest.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70032","citationCount":"0","resultStr":"{\"title\":\"Guest Editorial: New Frontiers in Image and Video Processing for Sustainable Agriculture\",\"authors\":\"Davide Moroni, Dimitrios Kosmopoulos\",\"doi\":\"10.1049/ipr2.70032\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, being a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore the innovative applications of advanced image processing methodologies in agriculture to enable sustainable production. The focus is not only on addressing agricultural challenges but also on unraveling new research trajectories in image processing that could ripple into other sectors like remote sensing, robotics and photogrammetry. The current special issue is aligned with the Sustainable Development Goals outlined in the 2030 agenda for sustainable development. Conversely, the agricultural domain provides a fertile ground for research challenges that motivate the exploration of new avenues.</p><p>In this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, along with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as assessing the quality and maturity of crops. 
Another paper focuses on advanced transformer-like methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. An additional paper is dedicated to livestock monitoring, offering quantitative methods for evaluating their impact on climate change.</p><p>More in detail, Sun and Huo present a robust maize leaf disease identification model aimed at delivering precise recognition of images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that this proposed model outperforms several popular architectures. Additionally, the authors provide an analysis of activation maps, which qualitatively demonstrates that the newly implemented model effectively minimizes background interference—a common issue in field-captured images due to factors such as lighting conditions, shooting angles, noise and intricate backgrounds. These challenges generally complicate the extraction of key feature information during model training.</p><p>Kiratiratanapruk et al. explore a computer vision-based approach to rice disease diagnosis, addressing real-world challenges encountered in rice field imagery. Their study considers critical factors such as environmental conditions and the varying sizes of rice leaves, aspects that are often under-represented in academic research. The proposed method integrates convolutional neural network (CNN) object detection with an image tiling strategy, where the division of images is guided by an automatically estimated rice leaf width. To achieve this, a specialized CNN model is developed using an 18-layer ResNet architecture for leaf width estimation. This model is trained on a newly generated set of tiled sub-images, ensuring that uniformly sized objects are used in training the rice disease prediction model. The study was validated on a dataset of 4960 images, covering eight distinct rice leaf diseases. Notably, the approach proves effective in managing variations in object size, significantly improving the accuracy of disease detection in practical field conditions.</p><p>Cho et al. explore a data augmentation technique for plant disease classification and early diagnosis, leveraging a generative adversarial network (GAN) to partially mitigate the challenges posed by limited dataset availability in agricultural computer vision. In deep learning–based classification models, data imbalance is a key issue that often hampers performance. To address this, the authors utilize tomato disease images from the publicly available PlantVillage dataset to assess the effectiveness of the GauGAN (Gaussian-GAN) algorithm. Their study highlights the impact of synthetic image generation in enhancing model training. The proposed GauGAN model is employed to generate additional training data for a MobileNet-based classification model. 
Its performance is then compared against models trained using traditional data augmentation techniques and the cut-mix and mix-up algorithms. Experimental findings, based on F1-scores, indicate that the GauGAN-based augmentation strategy outperforms conventional methods by over 10%, demonstrating its effectiveness in improving classification accuracy.</p><p>Huo et al. propose an enhanced multi-scale YOLOv8 model for detecting and recognizing dense lesions on apple leaves. The detection of apple leaf lesions presents significant challenges due to the wide range of species, diverse morphologies, varying lesion sizes and complex backgrounds. To address these challenges, the proposed YOLOv8 incorporates an improved C2f-RFEM module in the backbone network, enhancing the extraction of disease-related features. Additionally, a newly designed neck network integrates the C2f-DCN and C2f-DCN-EMA modules, which leverage deformable convolutions and an advanced attention mechanism to refine feature representation. To further enhance the detection of small-scale lesions, the model introduces a high-resolution detection head, improving recognition capabilities across multiple scales. The effectiveness of the improved YOLOv8 is validated using the COCO dataset, which includes 80 object categories, and an apple leaf disease dataset comprising eight disease types. Experimental results show that the enhanced model achieves superior performance in terms of mean average precision (mAP) and floating point operations (FLOPs) compared to baseline YOLOv8, Faster R-CNN, RetinaNet, SSD and YOLOv5s. In terms of parameter count and model size, the improved YOLOv8 gives competitive performance.</p><p>Du et al. introduce an improved variant of the YOLO family, extending its capabilities with respect to previous papers mentioned above beyond disease detection to include pest identification. Building on YOLOv7, the proposed model incorporates a progressive spatial adaptive feature pyramid (PSAFP) to enhance multi-scale feature representation. Additionally, the authors employ a combination of varifocal loss and rank-based mining loss to refine the object loss calculation, effectively reducing the impact of irrelevant negative samples during training. Evaluations conducted on the filtered PlantVillage dataset and the rice-corn pest dataset demonstrate the effectiveness of YOLOv7-PSAFP, which outperforms the baseline YOLOv7 model.</p><p>Building on a YOLO-based architecture, Ling et al. explore adaptive object detection with an enhanced model, shifting the focus from threat quantification, as addressed in the works introduced above, to maturity detection in raspberry cultivation. To tackle this challenge, the authors propose HSA-YOLOv5 (HSV Self-Adaptive YOLOv5), a method designed to identify raspberries at different ripeness stages—immature, nearly ripe and fully ripe. The approach involves converting images from the standard RGB colour space to an optimized HSV colour space. The method improves data representation by fine-tuning parameters and enhancing contrast among visually similar hues while preserving key image features. Additionally, adaptive HSV parameter selection is applied based on varying weather conditions, ensuring consistent preprocessing across the dataset. The improved model is evaluated against the standard YOLOv5 using a custom-built dataset. 
Experimental results indicate that the enhanced approach achieves an mAP of 0.97, marking a 6.42 percentage point improvement over the baseline YOLOv5 model.</p><p>Rossi et al. explore advanced image processing techniques, leveraging a combination of Swin Transformer and mixture-of-experts models to enhance image quality in remote sensing. Their approach has potential applications in agriculture, particularly for improving the accuracy and usability of multi-spectral imagery. The authors introduce Swin2-MoSE, a novel single-image super-resolution model specifically designed for remote sensing tasks. A key innovation of this model is a newly designed layer that effectively merges outputs from individual experts, alongside the implementation of a per-example strategy—an improvement over the more commonly used per-token approach. Additionally, they investigate the interaction between positional encodings, demonstrating how per-channel and per-head biases can complement each other to improve performance. To further enhance model effectiveness, the study incorporates a loss function that combines normalized cross-correlation (NCC) and structural similarity index measure (SSIM), addressing the known limitations of mean squared error (MSE) loss. Experimental evaluations reveal that Swin2-MoSE outperforms existing Swin-based models, on the Sen2Venµs and OLI2MSI datasets.</p><p>Finally, Embaby et al. explore the application of imaging technologies in livestock management and sustainability by introducing an optical gas imaging (OGI) and deep learning framework for quantifying enteric methane emissions from rumen fermentation in vitro. To achieve this, the authors propose a novel architecture called Gasformer, designed to enhance the detection and measurement of methane emissions. The model demonstrates competitive performance, which highlights the potential of OGI technology, when combined with advanced semantic segmentation models, to accurately predict and quantify methane emissions in livestock farming. This approach could contribute to the development of effective mitigation strategies aimed at reducing the environmental impact of methane emissions and addressing climate change.</p><p>We hope that the papers presented in this special issue reflect the significant progress made by modern methods in advancing sustainable agriculture. Thanks to recent advances in image and video processing, these techniques are now widely disseminated and capable of supporting innovative farming strategies worldwide. These developments have the potential to contribute to the Sustainable Development Goals outlined in the 2023 Agenda for Sustainable Development.</p><p><b>Davide Moroni</b>: conceptualization, writing – original draft, writing – review & editing. 
<b>Dimitrios Kosmopoulos</b>: conceptualization, writing – original draft, writing – review & editing.</p><p>The authors declare no conflicts of interest.</p>\",\"PeriodicalId\":56303,\"journal\":{\"name\":\"IET Image Processing\",\"volume\":\"19 1\",\"pages\":\"\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2025-03-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70032\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IET Image Processing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70032\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Image Processing","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/ipr2.70032","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Guest Editorial: New Frontiers in Image and Video Processing for Sustainable Agriculture
The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore innovative applications of advanced image processing methodologies in agriculture that enable sustainable production. The focus is not only on addressing agricultural challenges but also on opening new research trajectories in image processing that could ripple into other sectors such as remote sensing, robotics and photogrammetry. This special issue is aligned with the Sustainable Development Goals outlined in the 2030 Agenda for Sustainable Development. In turn, the agricultural domain provides fertile ground for research challenges that motivate the exploration of new avenues.
For this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, together with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as in assessing the quality and maturity of crops. Another paper focuses on advanced transformer-based methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. A final paper is dedicated to livestock monitoring, offering quantitative methods for evaluating the impact of livestock on climate change.
In more detail, Sun and Huo present a robust maize leaf disease identification model aimed at precise recognition in images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that the proposed model outperforms several popular architectures. Additionally, the authors provide an analysis of activation maps, which qualitatively demonstrates that the new model effectively minimizes background interference, a common issue in field-captured images caused by factors such as lighting conditions, shooting angles, noise and intricate backgrounds. These challenges generally complicate the extraction of key feature information during model training.
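To make the architectural substitution concrete, the following is a minimal PyTorch sketch of a CBAM block of the kind that can replace the squeeze-and-excitation stage of an MBConv block. The reduction ratio and kernel size are illustrative defaults, not the authors' implementation.

```python
# Minimal CBAM sketch: channel attention followed by spatial attention.
# Hyperparameters (reduction=16, kernel_size=7) are illustrative assumptions.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)     # channel-wise max map
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Drop-in replacement for the squeeze-and-excitation stage of an MBConv block."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```

Unlike squeeze-and-excitation, which reweights channels only, the second stage here produces a per-location weight map, which is what lets the network suppress cluttered backgrounds in field imagery.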
Kiratiratanapruk et al. explore a computer vision-based approach to rice disease diagnosis, addressing real-world challenges encountered in rice field imagery. Their study considers critical factors such as environmental conditions and the varying sizes of rice leaves, aspects that are often under-represented in academic research. The proposed method integrates convolutional neural network (CNN) object detection with an image tiling strategy, where the division of images is guided by an automatically estimated rice leaf width. To achieve this, a specialized CNN model is developed using an 18-layer ResNet architecture for leaf width estimation. This model is trained on a newly generated set of tiled sub-images, ensuring that uniformly sized objects are used in training the rice disease prediction model. The study was validated on a dataset of 4960 images, covering eight distinct rice leaf diseases. Notably, the approach proves effective in managing variations in object size, significantly improving the accuracy of disease detection in practical field conditions.
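The tiling idea can be sketched in a few lines: the tile size is chosen as a fixed multiple of the leaf width predicted by the regression CNN, so that objects in the resulting sub-images have roughly uniform scale. The `width_multiple` factor and the minimum tile size below are illustrative assumptions, not the paper's values.

```python
# Minimal sketch of leaf-width-guided image tiling (NumPy).
# width_multiple and the 64-pixel floor are illustrative assumptions.
import numpy as np

def tile_by_leaf_width(image: np.ndarray, leaf_width_px: float,
                       width_multiple: float = 8.0) -> list[np.ndarray]:
    """Split an H x W x 3 image into square tiles scaled to the estimated leaf width."""
    tile = max(64, int(round(leaf_width_px * width_multiple)))
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            tiles.append(image[y:y + tile, x:x + tile])  # edge tiles may be smaller
    return tiles
```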
Cho et al. explore a data augmentation technique for plant disease classification and early diagnosis, leveraging a generative adversarial network (GAN) to partially mitigate the challenges posed by limited dataset availability in agricultural computer vision. In deep learning–based classification models, data imbalance is a key issue that often hampers performance. To address this, the authors utilize tomato disease images from the publicly available PlantVillage dataset to assess the effectiveness of the GauGAN (Gaussian-GAN) algorithm. Their study highlights the impact of synthetic image generation in enhancing model training. The proposed GauGAN model is employed to generate additional training data for a MobileNet-based classification model. Its performance is then compared against models trained using traditional data augmentation techniques and the CutMix and MixUp algorithms. Experimental findings, based on F1-scores, indicate that the GauGAN-based augmentation strategy outperforms conventional methods by over 10%, demonstrating its effectiveness in improving classification accuracy.
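As a point of reference for that comparison, the MixUp baseline can be sketched in a few lines: image pairs and their one-hot labels are combined convexly, with the mixing coefficient drawn from a Beta distribution. The value of alpha below is a common default, assumed for illustration.

```python
# Minimal MixUp sketch (PyTorch); alpha=0.2 is an illustrative default.
import torch

def mixup_batch(images: torch.Tensor, labels: torch.Tensor, alpha: float = 0.2):
    """images: (B, C, H, W); labels: one-hot floats of shape (B, num_classes)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_images, mixed_labels
```

GAN-based augmentation differs in kind: rather than interpolating existing samples, it synthesizes entirely new disease images, which is why it can help more when the underlying class is genuinely under-represented.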
Huo et al. propose an enhanced multi-scale YOLOv8 model for detecting and recognizing dense lesions on apple leaves. The detection of apple leaf lesions presents significant challenges due to the wide range of species, diverse morphologies, varying lesion sizes and complex backgrounds. To address these challenges, the proposed YOLOv8 incorporates an improved C2f-RFEM module in the backbone network, enhancing the extraction of disease-related features. Additionally, a newly designed neck network integrates the C2f-DCN and C2f-DCN-EMA modules, which leverage deformable convolutions and an advanced attention mechanism to refine feature representation. To further enhance the detection of small-scale lesions, the model introduces a high-resolution detection head, improving recognition capabilities across multiple scales. The effectiveness of the improved YOLOv8 is validated using the COCO dataset, which includes 80 object categories, and an apple leaf disease dataset comprising eight disease types. Experimental results show that the enhanced model achieves superior performance in terms of mean average precision (mAP) and floating point operations (FLOPs) compared to baseline YOLOv8, Faster R-CNN, RetinaNet, SSD and YOLOv5s. In terms of parameter count and model size, the improved YOLOv8 gives competitive performance.
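The deformable convolutions underlying the C2f-DCN modules can be illustrated with a small sketch: a parallel convolution predicts per-pixel sampling offsets, and torchvision's DeformConv2d samples the input feature map at those shifted locations. The block structure and channel sizes below are assumptions for illustration, not the paper's design.

```python
# Minimal deformable-convolution sketch using torchvision.ops.DeformConv2d.
# The offset-prediction layer and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # two offsets (dy, dx) per kernel element
        self.offset = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        self.dconv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        return self.dconv(x, self.offset(x))

feats = torch.randn(1, 64, 80, 80)
print(DeformableBlock(64, 128)(feats).shape)  # torch.Size([1, 128, 80, 80])
```

Because the sampling grid bends to follow the predicted offsets, such layers adapt to the irregular shapes and varying sizes of leaf lesions better than a rigid square kernel.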
Du et al. introduce an improved variant of the YOLO family, extending the scope of the YOLO-based approaches discussed above beyond disease detection to pest identification. Building on YOLOv7, the proposed model incorporates a progressive spatial adaptive feature pyramid (PSAFP) to enhance multi-scale feature representation. Additionally, the authors employ a combination of varifocal loss and rank-based mining loss to refine the object loss calculation, effectively reducing the impact of irrelevant negative samples during training. Evaluations conducted on the filtered PlantVillage dataset and the rice-corn pest dataset demonstrate the effectiveness of YOLOv7-PSAFP, which outperforms the baseline YOLOv7 model.
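For readers unfamiliar with varifocal loss, a minimal sketch follows: positives are weighted by their IoU-aware target score, while negatives are down-weighted by a focal factor on the predicted confidence, which is exactly what suppresses the influence of easy, irrelevant negatives. The hyperparameters follow commonly used defaults (alpha = 0.75, gamma = 2.0) and are not necessarily the paper's settings.

```python
# Minimal varifocal loss sketch (PyTorch). target_score is the IoU-aware
# classification target (0 for negatives). alpha/gamma are common defaults.
import torch

def varifocal_loss(pred_logits: torch.Tensor, target_score: torch.Tensor,
                   alpha: float = 0.75, gamma: float = 2.0) -> torch.Tensor:
    p = pred_logits.sigmoid()
    positive = (target_score > 0).float()
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        pred_logits, target_score, reduction="none")
    # positives weighted by their target score; negatives by a focal factor
    weight = positive * target_score + (1.0 - positive) * alpha * p.pow(gamma)
    return (weight * bce).mean()
```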
Building on a YOLO-based architecture, Ling et al. explore adaptive object detection with an enhanced model, shifting the focus from threat quantification, as addressed in the works introduced above, to maturity detection in raspberry cultivation. To tackle this challenge, the authors propose HSA-YOLOv5 (HSV Self-Adaptive YOLOv5), a method designed to identify raspberries at different ripeness stages—immature, nearly ripe and fully ripe. The approach involves converting images from the standard RGB colour space to an optimized HSV colour space. The method improves data representation by fine-tuning parameters and enhancing contrast among visually similar hues while preserving key image features. Additionally, adaptive HSV parameter selection is applied based on varying weather conditions, ensuring consistent preprocessing across the dataset. The improved model is evaluated against the standard YOLOv5 using a custom-built dataset. Experimental results indicate that the enhanced approach achieves an mAP of 0.97, marking a 6.42 percentage point improvement over the baseline YOLOv5 model.
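The preprocessing idea can be sketched as follows: convert to HSV and adapt the saturation and value gains to the scene brightness, so that dull images taken under overcast conditions receive a stronger boost. The brightness threshold and gain schedule below are assumptions for illustration; the paper derives its parameters adaptively from weather conditions.

```python
# Minimal adaptive HSV preprocessing sketch (OpenCV).
# The brightness threshold and gains are illustrative assumptions.
import cv2
import numpy as np

def adaptive_hsv(image_bgr: np.ndarray) -> np.ndarray:
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    brightness = hsv[..., 2].mean() / 255.0
    s_gain = 1.4 if brightness < 0.4 else 1.15   # darker scene -> stronger boost
    v_gain = 1.3 if brightness < 0.4 else 1.05
    hsv[..., 1] = np.clip(hsv[..., 1] * s_gain, 0, 255)   # saturation channel
    hsv[..., 2] = np.clip(hsv[..., 2] * v_gain, 0, 255)   # value channel
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```

Boosting saturation contrast in HSV space is what helps separate the visually similar hues of nearly ripe and fully ripe berries before detection.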
Rossi et al. explore advanced image processing techniques, leveraging a combination of Swin Transformer and mixture-of-experts models to enhance image quality in remote sensing. Their approach has potential applications in agriculture, particularly for improving the accuracy and usability of multi-spectral imagery. The authors introduce Swin2-MoSE, a novel single-image super-resolution model specifically designed for remote sensing tasks. A key innovation of this model is a newly designed layer that effectively merges outputs from individual experts, alongside the implementation of a per-example strategy, an improvement over the more commonly used per-token approach. Additionally, they investigate the interaction between positional encodings, demonstrating how per-channel and per-head biases can complement each other to improve performance. To further enhance model effectiveness, the study incorporates a loss function that combines normalized cross-correlation (NCC) and the structural similarity index measure (SSIM), addressing the known limitations of mean squared error (MSE) loss. Experimental evaluations reveal that Swin2-MoSE outperforms existing Swin-based models on the Sen2Venµs and OLI2MSI datasets.
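In the spirit of that loss design, a combined NCC + SSIM objective can be sketched as follows. Both terms are similarities, so each is turned into a loss as (1 - similarity); the equal weighting and the single-window (non-sliding) SSIM are simplifying assumptions for brevity, not the paper's exact formulation.

```python
# Minimal NCC + SSIM loss sketch (PyTorch). Weights and the global (single-
# window) SSIM are simplifying assumptions; inputs assumed scaled to [0, 1].
import torch

def ncc(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Normalized cross-correlation over spatial dims, averaged over batch/channels."""
    xc = x - x.mean(dim=(-2, -1), keepdim=True)
    yc = y - y.mean(dim=(-2, -1), keepdim=True)
    num = (xc * yc).mean(dim=(-2, -1))
    den = torch.sqrt((xc * xc).mean(dim=(-2, -1)) * (yc * yc).mean(dim=(-2, -1))) + eps
    return (num / den).mean()

def global_ssim(x: torch.Tensor, y: torch.Tensor,
                c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    """SSIM computed over each whole image instead of sliding windows."""
    mx, my = x.mean(dim=(-2, -1)), y.mean(dim=(-2, -1))
    vx = ((x - mx[..., None, None]) ** 2).mean(dim=(-2, -1))
    vy = ((y - my[..., None, None]) ** 2).mean(dim=(-2, -1))
    cov = ((x - mx[..., None, None]) * (y - my[..., None, None])).mean(dim=(-2, -1))
    s = ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
    return s.mean()

def sr_loss(pred: torch.Tensor, target: torch.Tensor,
            w_ncc: float = 0.5, w_ssim: float = 0.5) -> torch.Tensor:
    return w_ncc * (1.0 - ncc(pred, target)) + w_ssim * (1.0 - global_ssim(pred, target))
```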
Finally, Embaby et al. explore the application of imaging technologies in livestock management and sustainability by introducing an optical gas imaging (OGI) and deep learning framework for quantifying enteric methane emissions from rumen fermentation in vitro. To achieve this, the authors propose a novel architecture called Gasformer, designed to enhance the detection and measurement of methane emissions. The model demonstrates competitive performance, which highlights the potential of OGI technology, when combined with advanced semantic segmentation models, to accurately predict and quantify methane emissions in livestock farming. This approach could contribute to the development of effective mitigation strategies aimed at reducing the environmental impact of methane emissions and addressing climate change.
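To give a rough picture of how segmentation output can feed quantification, one plausible reduction is to accumulate the per-frame plume mask area into a time series that is then calibrated against reference flux measurements. The helper below is a hypothetical sketch of that step, not the paper's pipeline; the `px_to_cm2` calibration constant is purely illustrative.

```python
# Hypothetical sketch: reduce per-frame plume segmentation masks to an area
# time series. The px_to_cm2 calibration factor is purely illustrative.
import numpy as np

def plume_area_series(masks: np.ndarray, px_to_cm2: float) -> np.ndarray:
    """masks: (T, H, W) binary plume masks from the segmentation model."""
    return masks.reshape(masks.shape[0], -1).sum(axis=1) * px_to_cm2
```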
We hope that the papers presented in this special issue reflect the significant progress made by modern methods in advancing sustainable agriculture. Thanks to recent advances in image and video processing, these techniques are now widely disseminated and capable of supporting innovative farming strategies worldwide. These developments have the potential to contribute to the Sustainable Development Goals outlined in the 2030 Agenda for Sustainable Development.

Davide Moroni: conceptualization, writing – original draft, writing – review & editing. Dimitrios Kosmopoulos: conceptualization, writing – original draft, writing – review & editing.

The authors declare no conflicts of interest.