{"title":"Multi-Fatigue Feature Selection and Fuzzy Logic-Based Intelligent Driver Drowsiness Detection","authors":"Mohan Arava, Divya Meena Sundaram","doi":"10.1049/ipr2.70052","DOIUrl":"https://doi.org/10.1049/ipr2.70052","url":null,"abstract":"<p>Driver drowsiness poses a critical threat, frequently resulting in highly perilous traffic accidents. The drowsiness detection is complicated by various challenges such as lighting conditions, occluded facial features, eyeglasses, and false alarms, making the accuracy, robustness across environments, and computational efficiency a major challenge. This study proposes a non-intrusive driver drowsiness detection system, leveraging image processing techniques and advanced fuzzy logic methods. It also introduces improvements to the Viola-Jones algorithm for swift and precise driver face, eye, and mouth identification. Extensive experiments involving diverse individuals and scenarios were conducted to assess the system's performance in detecting eye and mouth states. The results are highly promising, with eye detection accuracy at 91.8% and mouth detection achieving a remarkable 94.6%, surpassing existing methods. Real-time testing in varied conditions, including day and night scenarios and subjects with and without glasses, demonstrated the system's robustness, yielding a 97.5% test accuracy in driver drowsiness detection.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70052","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143762089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depth Completion and Inpainting for Specular Objects","authors":"He Liu, Yi Sun","doi":"10.1049/ipr2.70049","DOIUrl":"https://doi.org/10.1049/ipr2.70049","url":null,"abstract":"<p>Depth images or point clouds offer true three-dimensional insights into scene geometry, making depth perception essential for downstream tasks in computer vision. However, current commercial depth sensors often produce dense estimations with lower accuracy, especially on specular surfaces, leading to noisy and incomplete data. To address this challenge, we propose a novel framework based on latent diffusion models conditioned on RGBD images and semantic labels for depth completion and inpainting, effectively restoring depth values for both visible and occluded parts of specular objects. We enhance geometric guidance by designing various visual descriptors as conditions and introduce channel and spatial attention mechanisms in the conditional encoder to improve multi-modal feature fusion. Using the MP6D dataset, we render complete and dense depth images for benchmarking, enabling a comprehensive evaluation of our method against existing approaches. Extensive experiments demonstrate that our model outperforms previous methods, significantly improving the performance of downstream tasks by incorporating the predicted depth maps restored by our model.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70049","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143762090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Retinex-Based Network for Low-Light Image Enhancement With Multi-Scale Denoising and Focal-Aware Reflections","authors":"Peng Ji, Zhongyou Lv, Zhao Zhang","doi":"10.1049/ipr2.70059","DOIUrl":"https://doi.org/10.1049/ipr2.70059","url":null,"abstract":"<p>Low-light image enhancement addresses critical challenges in computer vision, including insufficient brightness, excessive noise, and loss of detail in low-light images, thus improving the quality and applicability of image data for various vision tasks. We propose an unsupervised Retinex-based network for low-light image enhancement, incorporating a multi-scale denoiser and focal-aware reflections. Our approach begins with a multi-scale denoising network that removes noise and redundant features while preserving both global and local image details. Subsequently, we employ an illumination separation network and a focal-aware reflection network to extract the illumination and reflection components, respectively. To enhance the accuracy of the reflection component and capture finer image details, we introduce a depthwise convolutional focal modulation block. This block improves the representative capacity of the reflection component feature map. Finally, we adjust the illumination component and synthesize it with the reflection component to generate the enhanced image. Extensive experiments conducted on 9 datasets and 13 methods, using metrics such as SSIM, PSNR, LPIPS, NIQE, and BRISQUE, demonstrate that the proposed method outperforms existing unsupervised approaches and shows competitive performance when compared to supervised methods.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70059","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143762088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coarse-Grained Ore Distribution on Conveyor Belts With TRCU Neural Networks","authors":"Weinong Liang, Xiaolu Sun, Yutao Li, Yang Liu, Guanghui Wang, Jincheng Wang, Chunxia Zhou","doi":"10.1049/ipr2.70057","DOIUrl":"https://doi.org/10.1049/ipr2.70057","url":null,"abstract":"<p>The particle size distribution of ore is a key evaluation indicator of the degree of ore fragmentation and plays a key role in the separation of mineral processing. Traditional ore size detection is often done by manual sieving, which takes a great deal of time and labor. In this work, a deep learning network model (referred to as TRCU), combining Transformer with residual blocks and CBAM attention mechanism in an encoder-decoder structure was developed for particle size detection of medium and large particles in a wide range of particle sizes in an ore material transportation scenario. This model presents a unique approach to improve the accuracy of identifying ore regions in images, utilizing three key features. Firstly, the model utilizes the CBAM attention mechanism to increase the weighting of ore regions in the feature fusion channel; secondly, a Transformer module is used to enhance the correlation of features in coarse-grained ore image regions in the deepest encoding and decoding stages; finally, the residual module is used to enhance useful feature information and reduce noise. The validation experiments are conducted on a transport belt dataset with large variation in particle size and low contrast. The results show that the proposed model can capture the edges of different particle sizes and achieve accurate segmentation of large particle size ore images. The MIoU values of 82.44%, MPA of 90.21%, and accuracy of 94.91% are higher than those of other existing methods. This work proposes a reliable method for automated detection of mineral particle size and will promote the automation level of ore processing.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70057","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143749520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GDS-YOLO: A Rice Diseases Identification Model With Enhanced Feature Extraction Capability","authors":"Yourui Huang, Xi Feng, Tao Han, Hongping Song, Yuwen Liu, Meiping Bao","doi":"10.1049/ipr2.70034","DOIUrl":"https://doi.org/10.1049/ipr2.70034","url":null,"abstract":"<p>Accurate identification of rice diseases is a prerequisite for improving rice yield and quality. However, the rice diseases are complex, and the existing identification models have the problem of weak ability to extract rice disease features. To address this issue, this paper proposes a rice disease identification model with enhanced feature extraction capability, named GDS-YOLO. The proposed GDS-YOLO model improves the YOLOv8n model by introducing the GsConv module, the Dysample module, the spatial context-aware module (SCAM) and WIoU v3 loss functions. The GsConv module reduces the model's number of parameters and computational complexity. The Dysample module reduces the loss of the rice diseases feature during the extraction process. The SCAM module allows the model to ignore the influence of complex backgrounds and focus on extracting rice disease features. The WIoU v3 loss function optimises the regression box loss of rice disease features. Compared with the YOLOv8n model, the P and mAP50 of GDS-YOLO increased by 5.4% and 4.1%, respectively, whereas the number of parameters and GFLOPS decreased by 23% and 10.1%, respectively. The experimental results show that the model proposed in this paper reduces the model complexity to a certain extent and achieves good rice diseases identification results.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70034","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143741611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LCM-YOLO: A Small Object Detection Method for UAV Imagery Based on YOLOv5","authors":"Shaodong Liu, Faming Shao, Weijun Chu, Heng Zhang, Dewei Zhao, Jinhong Xue, Qing Liu","doi":"10.1049/ipr2.70051","DOIUrl":"https://doi.org/10.1049/ipr2.70051","url":null,"abstract":"<p>This study addresses the challenges of detecting small targets and targets with significant scale variations in UAV aerial images. We propose an improved YOLOv5 model, named LCM-YOLO, to tackle these challenges. Initially, a local fusion mechanism is introduced into the C3 module, forming the C3-LFM module to enhance feature information acquisition during feature extraction. Subsequently, the CCFM is employed as the neck structure of the network, leveraging its lightweight convolution and cross-scale feature fusion characteristics to effectively improve the model's ability to integrate target features at different levels, thereby enhancing its adaptability to scale variations and detection performance for small targets. Additionally, a multi-head attention mechanism is integrated at the front end of the detection head, allowing the model to focus more on the detailed information of small targets through weight distribution. Experiments on the VisDrone2019 dataset show that LCM-YOLO has excellent detection capabilities. Compared to the original YOLOv5 model, its mAP50 and mAP50-95 metrics are improved by 7.2% and 5.1%, respectively, reaching 40.7% and 22.5%. This validates the effectiveness of the LCM-YOLO model for detecting small and multi-scale targets in complex backgrounds.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70051","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143741147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editorial: New Frontiers in Image and Video Processing for Sustainable Agriculture","authors":"Davide Moroni, Dimitrios Kosmopoulos","doi":"10.1049/ipr2.70032","DOIUrl":"https://doi.org/10.1049/ipr2.70032","url":null,"abstract":"<p>The rapidly evolving landscape of image processing, with the integration of cutting-edge technologies such as deep learning, has expanded its influence across various sectors. Agriculture, being a pillar of sustainable development, is on the cusp of a major technological transformation, necessitating the synergy of advanced sensors, image processing and machine learning. Recognizing the symbiotic relationship between image processing advancements and the agricultural domain's intrinsic challenges, this special issue aims to bring to the fore the innovative applications of advanced image processing methodologies in agriculture to enable sustainable production. The focus is not only on addressing agricultural challenges but also on unraveling new research trajectories in image processing that could ripple into other sectors like remote sensing, robotics and photogrammetry. The current special issue is aligned with the Sustainable Development Goals outlined in the 2030 agenda for sustainable development. Conversely, the agricultural domain provides a fertile ground for research challenges that motivate the exploration of new avenues.</p><p>In this Special Issue, we received 25 submissions, all of which underwent a thorough peer-review process. Of these, eight papers were accepted for publication. The high volume of submissions, along with the quality of the accepted papers, highlights the relevance and success of this Special Issue. Six of the eight accepted papers address issues related to common RGB images collected in the field by conventional sensors, showcasing the significant potential of computer vision in monitoring plants, pests and diseases, as well as assessing the quality and maturity of crops. Another paper focuses on advanced transformer-like methods for achieving super-resolution in remote sensing images, which proves particularly beneficial for multi-spectral imaging of crops. An additional paper is dedicated to livestock monitoring, offering quantitative methods for evaluating their impact on climate change.</p><p>More in detail, Sun and Huo present a robust maize leaf disease identification model aimed at delivering precise recognition of images taken at close range. They introduce an enhanced lightweight network based on the EfficientNet architecture. To this end, the model replaces the squeeze-and-excitation module found in the MBConv component (a typical EfficientNet module) with the convolutional block attention module (CBAM). This modification allows the network to focus on channel-wise correlations while adaptively learning the attentional weight of each spatial location. Moreover, a multi-scale feature fusion layer utilizing residual connections is incorporated to extract more comprehensive and richer disease features across varying scales. Experimental results indicate that this proposed model outperforms several popular architectures. 
Additionally, the authors provide an analysis of activation maps, which qualitatively ","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70032","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143726870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Blemish Detection Based on YOLOv8n Optimised with Space-to-Depth and GCNet Attention Mechanisms","authors":"Shuxi Zhou, Lijun Liang","doi":"10.1049/ipr2.70039","DOIUrl":"https://doi.org/10.1049/ipr2.70039","url":null,"abstract":"<p>Facial blemishes are small and often similar in colour to the surrounding skin, making detection even more challenging. This paper proposes an improved algorithm based on YOLOv8 to address the limitations of the original YOLOv8n in facial blemish detection. First, we introduce space-to-depth-convolution (SPD-Conv), which replaces traditional downsampling methods in convolutional neural networks, preserving spatial details without reducing the image resolution. This enhances the model's ability to detect small imperfections. Additionally, the integration of GCNet helps detect blemishes that closely resemble surrounding skin tones by leveraging global context modelling. The improved model better understands the overall structure and features of the face. Experimental results show that our model achieves a 5.3% and 5.6% improvement in mAP50 and mAP50-95, respectively, over YOLOv8n. Furthermore, it outperforms the latest YOLOv11n model by 6.9% and 7.2% in mAP50 and mAP50-95.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70039","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143707634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CMFNet: A Three-Stage Feature Matching Network With Geometric Consistency and Attentional Enhancement","authors":"RenKai Xiao, ShengZhi Yuan, Kai Jin, Min Li, Yan Tang, Sen Shen","doi":"10.1049/ipr2.70050","DOIUrl":"https://doi.org/10.1049/ipr2.70050","url":null,"abstract":"<p>Current feature matching methods typically employ a two-stage process, consisting of coarse and fine matching. However, the transition from the coarse to the fine stage often lacks an effective intermediate state, leading to abrupt changes in the matching process. This can hinder smooth transitions and precise localization. To address these limitations, this study introduces Coarse-Mid-Fine Match Net (CMFNet), a novel three-stage image feature matching method. CMFNet incorporates an intermediate-grained matching phase between the coarse and fine stages to facilitate a more gradual and seamless transition. In the proposed method, the intermediate-grained matching refines the correspondences obtained from the coarse-grained stage using Adaptive-random sample consensus (RANSAC). Subsequently, the midtransformer, which integrates sparse self-attention (SSA) mechanisms with local-feature-based cross-attention, is employed for feature extraction. This approach enhances the feature extraction capabilities and improves the adaptability to various types of image data, thereby boosting overall matching performance. Additionally, a cross-attention mechanism based on local region features is introduced. The network undergoes fully self-supervised training, aiming to minimize a match loss that is autonomously generated from the training data using a multi-scale cross-entropy method. A series of thorough experiments was carried out on diverse real-world datasets, including both unaltered and extensively processed images.The results demonstrate that the proposed method outperforms state-of-the-art approaches, achieving 0.776 mAUC on the HPatches dataset and 0.442 mAUC on the ISC-HE dataset.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70050","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143717443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EANet: Integrate Edge Features and Attention Mechanisms Multi-Scale Networks for Vessel Segmentation in Retinal Images","authors":"Jiangyi Zhang, Yuxin Tan, Duantengchuan Li, Guanghui Xu, Fuling Zhou","doi":"10.1049/ipr2.70056","DOIUrl":"https://doi.org/10.1049/ipr2.70056","url":null,"abstract":"<p>Accurately extracting blood vessel structures from retinal fundus images is critical for the early diagnosis and treatment of various ocular and systemic diseases. However, retinal vessel segmentation continues to face significant challenges. Firstly, capturing the boundary information of small vessels is particularly difficult. Secondly, uneven vessel thickness and irregular distribution further complicate the multi-scale feature modelling. Lastly, low-contrast images lead to increased background noise, further affecting the segmentation accuracy. To tackle these challenges, this article presents a multi-scale segmentation network that combines edge features and attention mechanisms, referred to as EANet. It demonstrates significant advantages over existing methods. Specifically, EANet consists of three key modules: the edge feature enhancement module, the multi-scale information interaction encoding module, and the multi-class attention mechanism decoding module. Experimental results validate the effectiveness of the method. Specifically, EANet outperforms existing advanced methods in the precise segmentation of small and multi-scale vessels and in effectively filtering background noise to maintain segmentation continuity.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70056","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143707394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}