IET Image Processing: Latest Articles

An X-Ray Contraband Detection Method Based on Improved YOLOv8
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-17 DOI: 10.1049/ipr2.70135
Jianing Chen, Juan Hao, Xiaoqun Liu
Abstract: X-ray detection of contraband is crucial for public safety, but it is hampered by cluttered backgrounds and overlapping objects in security inspection images. This study proposes a detection framework based on You Only Look Once version 8 (YOLOv8) with three key innovations: multi-scale cross-axis attention (MCA), which captures global dependencies through collaborative horizontal and vertical attention and suppresses irrelevant features in complex X-ray scenes; a lightweight bottleneck architecture using partial convolution (PConv), which significantly reduces floating point operations (FLOPs) while preserving positional sensitivity; and the focal-enhanced intersection over union (Focaler-IoU) loss function, which dynamically weights difficult samples to improve regression accuracy. Experiments on the prohibited item detection in the X-ray dataset show that the model achieves a mean average precision at IoU = 0.5 (mAP@0.5) of 97.3%, outperforming YOLOv8s by 1.2 percentage points and surpassing YOLOv10-S (96.5%) and YOLOv12-S (96.8%), while maintaining real-time performance at 121 frames per second. Ablation studies highlight the contribution of each module: MCA raises mAP by 0.7%, PConv cuts FLOPs by 31%, and Focaler-IoU increases precision by 0.9% and recall by 2.4%. The proposed method shows substantial potential for real-time security inspection.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70135
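The abstract names the Focaler-IoU loss but does not give its formula. As a rough illustration only, the sketch below uses the interval-remapping form in which Focaler-IoU is commonly described: IoU is linearly rescaled from an interval [d, u] onto [0, 1] and clipped, and the loss is one minus the result. The function names and the default values of `d` and `u` are assumptions made for this example, not the authors' settings.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def focaler_iou_loss(box_a, box_b, d=0.0, u=0.95):
    """1 - IoU after remapping IoU from [d, u] onto [0, 1] (d, u are assumed values)."""
    remapped = (iou(box_a, box_b) - d) / (u - d)
    return 1.0 - min(1.0, max(0.0, remapped))
```

Raising `d` makes every box pair with IoU below `d` contribute the maximum loss, while lowering `u` makes pairs above `u` contribute none, which is how the interval shifts emphasis between hard and easy samples.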
Citations: 0
SEM-YOLO: A Small Target Defect Detection Model for Photovoltaic Modules
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-17 DOI: 10.1049/ipr2.70134
Wang Yun, Yin Wang, Gang Xie, Zhicheng Zhao
Abstract: Defect detection is key to extending the lifetime of photovoltaic (PV) modules. However, existing methods still face significant challenges in detecting small and ambiguous targets. To this end, this paper proposes a PV module defect detection model, SEM-YOLO, based on YOLOv8. The model makes three changes. First, the SPD-Conv module replaces the traditional convolution in the backbone and neck to reduce the information loss caused by excessive down-sampling, thus enhancing the detection of small targets. Second, a C2f-EMA module is introduced in the neck, in which the efficient multiscale attention (EMA) module enhances feature extraction by redistributing weights and prioritizing relevant features to improve the perception and recognition of small target defects (hot spots). Finally, a small target detection layer and a MultiSEAM detection head are added so that the model can capture and detect small targets more efficiently at the output stage. The experimental results show that the mAP of the improved model reaches 93.8% and the mAP for small target defects reaches 83%, improvements of 2.23% and 7.62% over YOLOv8, respectively. In addition, compared with mainstream models (RT-DETR, YOLOv9s, YOLOv10n, and YOLOv11), detection accuracy on both overall and small-target defects is significantly improved, which further validates the effectiveness of the model.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70134
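To illustrate why a space-to-depth step avoids the information loss of strided down-sampling, here is a minimal sketch on a single-channel map stored as nested lists. It covers only the rearrangement idea behind SPD-Conv; the non-strided convolution that would follow is omitted, and the function name and sub-map ordering are assumptions for this example.

```python
def space_to_depth(feature_map, scale=2):
    """Split an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    Every input value appears in exactly one sub-map, so resolution is reduced
    without discarding any pixels (unlike a stride-2 convolution or pooling).
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    return [
        [row[dx::scale] for row in feature_map[dy::scale]]
        for dy in range(scale)
        for dx in range(scale)
    ]
```

A 4 x 4 input becomes four 2 x 2 sub-maps that together still hold all 16 values.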
Citations: 0
A Lightweight Channel Correlation Invertible Network for Image Denoising
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-17 DOI: 10.1049/ipr2.70119
Fuxian Sui, Hua Wang, Fan Zhang
Abstract: In recent years, deep learning has made significant progress in image denoising. However, advanced methods are also growing more complex, which raises computational cost and hinders convenient analysis and comparison of methods. Therefore, a lightweight model based on invertible networks is proposed. Invertible networks have clear advantages for image denoising: they are lightweight, memory-saving, and information-lossless in backpropagation. To remove noise effectively and restore a clean image, the high-frequency part of the image is resampled and modeled. A channel context block is proposed to focus on useful channels and improve the network's perception of useful information in images while keeping complexity and computing cost under control. At the same time, a residual structure with channel correlation modeling extracts features in the convolutional flow, retaining image detail and texture and learning more of the image's spatial features, so as to prevent blur and distortion during denoising. The proposed method achieves lower computational complexity without sacrificing performance.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70119
Citations: 0
A Data-Driven Solution for Large-Scale Open-Pit Mines Excavation Monitoring Based on 3D Point Cloud
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-16 DOI: 10.1049/ipr2.70130
Taiming He, Jiasui Zhang, Lu Yang
Abstract: We present an adaptive point cloud workflow that withstands heavy environmental noise and the large datasets typical of open-pit mines. The workflow automatically tunes its parameters from the statistics of each input scene, eliminating manual parameter tuning. For instance, it sets the ICP correspondence distance and the clustering threshold without user input. Additionally, our method integrates a coarse-to-fine registration strategy, robust change detection, and precise volumetric estimation based on digital elevation models. Experiments on simulated mining datasets show our method remains robust under heavy noise and misalignment, with volume errors consistently below 2%. A field pilot study at a limestone quarry further underscores its practical reliability and operational robustness. This research provides a precise, automated solution for real-time mining monitoring, effectively advancing sustainable and intelligent mining practices. Source code and datasets are publicly available at github.com/deemoe404/volcal_baseline.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70130
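The volumetric estimation step can be pictured as differencing two aligned digital elevation models (DEMs) and summing the drop over all grid cells. The sketch below is a generic illustration of that idea, not the authors' implementation (which is in the repository cited above); the function name and the grids-as-nested-lists representation are assumptions.

```python
def excavated_volume(dem_before, dem_after, cell_size):
    """Cut volume between two aligned elevation grids of identical shape.

    Elevations and cell_size must share one length unit; the result is in
    that unit cubed. Cells where the surface rose (fill) are ignored.
    """
    cell_area = cell_size * cell_size
    volume = 0.0
    for row_before, row_after in zip(dem_before, dem_after):
        for z_before, z_after in zip(row_before, row_after):
            drop = z_before - z_after
            if drop > 0:  # count only cells where material was removed
                volume += drop * cell_area
    return volume
```

This assumes the two surveys have already been registered to the same grid, which is the job of the coarse-to-fine registration stage the abstract mentions.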
Citations: 0
A Survey on Face-Swapping Methods for Identity Manipulation in Deepfake Applications
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-13 DOI: 10.1049/ipr2.70132
Ramamurthy Dhanyalakshmi, Gabriel Stoian, Daniela Danciulescu, Duraisamy Jude Hemanth
Abstract: A face-swapping framework is designed to generate an image or video that merges the pose and characteristics of the input image with the identity from the source image. It has found significant applications in entertainment, privacy protection and digital content creation. However, this process is inherently complex, involving challenges like identity preservation, expression consistency and photorealism. Despite the rapid advancements in face-swapping technology, there has been a noticeable lack of in-depth analysis of the intricate mechanisms and recent developments in this field. This work attempts to bridge that gap by providing an extensive overview of face-swapping methods based on deep learning. Researchers, developers and practitioners interested in learning about the state of face-swapping technology and its possible uses may find this survey to be an invaluable resource. It will provide insights that can inform future research and innovation in this fast-evolving area.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70132
Citations: 0
Classification Algorithm for Sitting Postures Using Weighted Random Forest
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-10 DOI: 10.1049/ipr2.70126
Jaeeun Lee, Hongseok Choi, Jongnam Kim
Abstract: The increasing use of computers has led to a significant rise in neck and back disorders caused by poor sitting posture. While various posture analysis methods have been proposed to mitigate these issues, existing approaches are often limited by constrained data acquisition environments, low accuracy, and restricted posture classification capabilities. In this paper, we propose a method for classifying sitting postures that negatively impact health. By capturing front-facing images and detecting the coordinates and angles of the face and shoulders, our method uses a random forest algorithm for posture classification. In experiments, the proposed approach achieved an accuracy, TPR, FPR, and F1-score of 0.983, 0.988, 0.004, and 0.983, respectively, outperforming previous studies.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70126
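The abstract describes deriving coordinate and angle features from face and shoulder keypoints before classification. The sketch below shows one plausible way to turn three keypoints into a small feature vector; the specific features, their normalisation by shoulder width, and all function names are assumptions for illustration, and the classifier itself is not shown.

```python
import math


def line_angle_degrees(point_a, point_b):
    """Angle of the line a->b relative to the image x-axis, in degrees."""
    return math.degrees(math.atan2(point_b[1] - point_a[1], point_b[0] - point_a[0]))


def posture_features(face_center, left_shoulder, right_shoulder):
    """Build a feature vector from three (x, y) keypoints in image coordinates.

    Assumes the two shoulder points are distinct, so shoulder width is non-zero.
    """
    shoulder_mid = ((left_shoulder[0] + right_shoulder[0]) / 2,
                    (left_shoulder[1] + right_shoulder[1]) / 2)
    shoulder_width = math.dist(left_shoulder, right_shoulder)
    return [
        line_angle_degrees(left_shoulder, right_shoulder),    # shoulder tilt
        (face_center[0] - shoulder_mid[0]) / shoulder_width,  # lateral head offset
        (shoulder_mid[1] - face_center[1]) / shoulder_width,  # head height above shoulders
    ]
```

Dividing by shoulder width makes the offsets roughly independent of how far the person sits from the camera.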
Citations: 0
A DNeRF Image Denoising Method Based on MSAF-DT
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-08 DOI: 10.1049/ipr2.70122
Wenxuan Xu, Meng Huang, Qian Xu
Abstract: Rendering novel and realistic images is crucial in applications such as augmented reality, virtual reality, 3D content creation, gaming, and the film industry. However, dynamic image rendering often suffers from significant noise, which compromises clarity and realism. Dynamic-Neural Radiance Fields (D-NeRF), an extension of the original NeRF model, addresses this challenge by enabling the rendering of dynamic images. Despite its advantages, D-NeRF often generates significant noise in the rendered images. Addressing this limitation, this paper proposes a Transformer-based model, Multi-Scale Attention Fusion Denoise Transformer (MSAF-DT), designed to enhance the clarity of rendered images. MSAF-DT constructs a deep neural network by stacking multiple Transformer blocks, with each block adaptively extracting complex features and dependencies from the data. The multi-head self-attention (MHSA) mechanism effectively captures long-range dependencies, which is crucial for processing sequences in dynamic radiance fields. Additionally, the model supports parallel processing of the entire sequence, significantly enhancing training efficiency. This design enables MSAF-DT to handle the noise present in D-NeRF outputs while preserving essential features. Experimental results on the Nerf_Synthetic dataset demonstrate that the proposed method outperforms D-NeRF in both image clarity and processing efficiency, achieving higher PSNR scores and faster convergence during training.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70122
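The PSNR score used to compare rendered images is computed from the mean squared error between a reference image and a test image. The sketch below is the standard definition on single-channel images stored as nested lists; the function name and the `max_value` default are assumptions for this example.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    mse = sum((r - t) ** 2 for r, t in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the test image is closer to the reference, so a denoiser that raises PSNR has reduced pixel-level error.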
Citations: 0
Improvement of Dam Crack Detection Algorithm for YOLOv9
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-08 DOI: 10.1049/ipr2.70124
Huixia Zhang, Xuhui Jiang, Yitong Liu, JinHua Qian, Lixue Ni
Abstract: Dams, as crucial water conservancy engineering facilities, play a role in safeguarding people's livelihoods and providing economic benefits. However, due to the impact of natural factors and human activities, dams may develop cracks and other potential safety hazards during operation. Crack detection can identify these potential issues in a timely manner, allowing appropriate measures to be taken for repair and reinforcement, thereby preventing catastrophic consequences such as dam breaches under extreme weather or geological conditions. For dam crack detection, this paper presents a method, YOLOv9-LAE, intended to reduce missed and false detections. Firstly, the large separable kernel attention (LSKA) module is introduced, which emphasises positional information while focusing on channel features. Secondly, the SPPFELAN in YOLOv9 is replaced by the AIFI module, since capturing the key information in the image enables the subsequent modules to detect cracks accurately. Finally, EIoU is used to calculate the loss, accelerating training convergence and improving the accuracy of crack detection. The results indicate that YOLOv9-LAE achieves a precision of 90.7% and a recall of 75.1%, with mAP@0.5 at 81.5% and mAP@0.5:0.95 at 60.6%. Compared to YOLOv9, precision has improved by 9.9%, recall by 2%, mAP@0.5 by 1.5% and mAP@0.5:0.95 by 1.5%.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70124
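For readers unfamiliar with the EIoU loss mentioned above, the sketch below follows its usual published form: one minus IoU, plus penalties for centre distance, width difference and height difference, each normalised by the smallest enclosing box. It is a plain-Python illustration with an assumed function name, not the training code used in the paper, and it assumes both boxes have positive width and height.

```python
def eiou_loss(pred, target):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU term
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + tw * th - inter)

    # Width and height of the smallest box enclosing both boxes
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centres
    center_dist_sq = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    return (1.0 - iou
            + center_dist_sq / (cw ** 2 + ch ** 2)
            + (pw - tw) ** 2 / cw ** 2
            + (ph - th) ** 2 / ch ** 2)
```

Because width and height errors are penalised directly, the gradient still carries useful shape information when two boxes have similar IoU but different aspect ratios.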
Citations: 0
Characterizing Natural Adversarial Examples Through Activation Map Analysis
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-08 DOI: 10.1049/ipr2.70123
Anibal Pedraza, Nerea Leon, Harbinder Singh, Oscar Deniz, Gloria Bueno
Abstract: Adversarial examples are an intriguing and critical topic in the field of machine learning. The impact of malignant perturbations on deep learning-based systems, especially in safety-critical applications, highlights a significant security concern. While most research has focused on artificially generated adversarial attacks, crafted through optimization algorithms and constrained perturbations, adversarial examples can also occur naturally, without any artificial manipulation, during the prediction of real-world images. These naturally occurring adversarial examples pose unique challenges, as they are harder to detect and interpret. Despite their importance, the study of natural adversarial examples remains in its early stages. Fundamental questions remain unanswered: Do natural adversarial examples exhibit similar behaviours or properties as artificially generated ones? How should models be adapted to improve their robustness against such natural inputs? To address these questions, this work proposes an in-depth analysis of activation maps to compare the internal behaviour of neural networks when processing clean images, artificially perturbed inputs and natural adversarial examples. A set of quantitative metrics is extracted from activation heatmaps at various network layers, including mean activation intensity, centroid displacement and standard reference image quality metrics. These measurements enable a systematic comparison of how the network attends to different image regions under varying conditions. The experimental results demonstrate that natural adversarial examples exhibit statistically significant differences in activation patterns compared to their artificial counterparts, suggesting that they may require distinct strategies for detection and defence.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70123
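The abstract lists mean activation intensity and centroid displacement among the heatmap metrics without defining them. The sketch below shows one straightforward reading of those two metrics for 2D heatmaps stored as nested lists; the definitions and function names are assumptions and may differ from those used in the paper.

```python
import math


def mean_activation(heatmap):
    """Average value over all cells of a 2D heatmap."""
    values = [v for row in heatmap for v in row]
    return sum(values) / len(values)


def activation_centroid(heatmap):
    """Intensity-weighted (row, col) centroid of a non-negative 2D heatmap."""
    total = sum(sum(row) for row in heatmap)
    if total == 0:
        raise ValueError("heatmap has no activation")
    row_c = sum(r * v for r, row in enumerate(heatmap) for v in row) / total
    col_c = sum(c * v for row in heatmap for c, v in enumerate(row)) / total
    return row_c, col_c


def centroid_displacement(heatmap_a, heatmap_b):
    """Euclidean distance, in cells, between the centroids of two heatmaps."""
    row_a, col_a = activation_centroid(heatmap_a)
    row_b, col_b = activation_centroid(heatmap_b)
    return math.hypot(row_a - row_b, col_a - col_b)
```

Comparing the heatmap of a clean image with that of its adversarial counterpart at the same layer gives a single number for how far the network's focus has moved.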
Citations: 0
Unpaired Fundus Image Enhancement Using Image Decomposition
IF 2, Zone 4, Computer Science
IET Image Processing Pub Date : 2025-06-05 DOI: 10.1049/ipr2.70116
Kun Chen, Yu Ye, Huazhu Fu, Yuhao Luo, Ronald X. Xu, Mingzhai Sun
Abstract: Low-quality fundus images pose significant challenges for both ophthalmologists and computer-aided diagnosis systems. While many existing deep learning-based image quality enhancement algorithms require low- and high-quality image pairs for training, such pairs are often difficult to obtain in practice. On the other hand, unpaired image enhancement algorithms tend to struggle in preserving small structures and suppressing artefacts, which are crucial for medical applications. To address these issues, we propose an unpaired structure-preserving cycle quality alternating network for low-quality fundus image enhancement. Our method consists of three main components: (1) a cycle quality alternating framework to provide pixel-wise supervision for unpaired image enhancement, (2) a quality-aware disentangle module to enhance the extrinsic representation of the low-quality image with the high-quality reference image, and (3) an instance normalized skip to improve the network's structure-preserving capability. We tested our method on both synthetic and authentic clinical images with pathological structures and found it to be superior to state-of-the-art algorithms in terms of improving image quality while preserving delicate structures. Additionally, the proposed network demonstrated strong generalization ability in improving the quality of unseen images, as tested on 135-degree neonatal fundus images.
Vol. 19, No. 1. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70116
Citations: 0