{"title":"Self-Supervised Learning for Domain Generalization With a Multi-Classifier Ensemble Approach","authors":"Zhenkai Qin, Qining Luo, Xunyi Nong, Xiaolong Chen, Hongfeng Zhang, Cora Un In Wong","doi":"10.1049/ipr2.70098","DOIUrl":"https://doi.org/10.1049/ipr2.70098","url":null,"abstract":"<p>Domain generalization poses significant challenges, particularly as models must generalize effectively to unseen target domains after training on multiple source domains. Traditional approaches typically aim to minimize domain discrepancies; however, they often fall short when handling complex data variations and class imbalance. In this paper, we propose an innovative model, the self-supervised learning multi-classifier ensemble (SSL-MCE), to address these limitations. SSL-MCE integrates self-supervised learning within a dynamic multi-classifier ensemble framework, leveraging ResNet as a shared feature extraction backbone. By combining four distinct classifiers, it captures diverse and complementary features, thereby enhancing adaptability to new domains. A self-supervised rotation prediction task enables SSL-MCE to focus on intrinsic data structures rather than domain-specific details, learning robust domain-invariant features. To mitigate class imbalance, we incorporate adaptive focal attention loss (AFAL), which dynamically emphasizes challenging and rare instances, ensuring improved accuracy on difficult samples. Furthermore, SSL-MCE adopts a dynamic loss-based weighting scheme to prioritize more reliable classifiers in the final prediction. Extensive experiments conducted on public benchmark datasets, including PACS and DomainNet, indicate that SSL-MCE outperforms state-of-the-art methods, achieving superior generalization and resource efficiency through its streamlined ensemble framework.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70098","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144214162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Small-Object Detection in UAV Images Using Modified YOLOv5 Model","authors":"Bach-Thanh Lieu, Chi-Khang Nguyen, Huynh-Lam Nguyen, Thanh-Hai Le","doi":"10.1049/ipr2.70121","DOIUrl":"https://doi.org/10.1049/ipr2.70121","url":null,"abstract":"<p>This study presents a modified YOLOv5 algorithm specifically designed to enhance small-object detection in unmanned aerial vehicle (UAV) images. Traditional object detection in UAV images is particularly challenging due to the high altitude of the cameras, which results in small object sizes and varying viewing angles. To address these challenges, the algorithm incorporates an additional prediction head to detect objects across a wide range of scales, a channel feature fusion with involution (CFFI) block to minimize information loss, a convolutional block attention module (CBAM) to highlight the crucial spatial and channel features, and a C3 structure with a Transformer block (C3TR) to capture contextual information. The algorithm additionally employs soft non-maximum suppression to enhance the bounding box scoring of overlapping objects in dense scenes. Extensive experiments were conducted on the VisDrone-DET2019 dataset, which demonstrated the effectiveness of the proposed algorithm. The results showed improvements with precision scores of 55.0%, recall scores of 44.6%, mean average precision scores of mAP50 = 50.9% and mAP50:95 = 33.0% on the VisDrone-DET2019 validation set, and precision of 50.8%, recall of 37.3%, mAP50 = 44.2%, and mAP50:95 = 27.3% on the VisDrone-DET2019 testing set. The improved performance is due to the incorporation of attention mechanisms, which allow the proposed model to stay lightweight while still extracting the features needed to detect small objects.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70121","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144191091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Medical Image Registration via Spatial Feature Extraction Mamba and Substrate Iterative Refinement","authors":"Zilong Xue, Kangjian He, Dan Xu, Jian Gong","doi":"10.1049/ipr2.70117","DOIUrl":"https://doi.org/10.1049/ipr2.70117","url":null,"abstract":"<p>One of the major challenges in medical image registration is balancing computational efficiency with the ability to capture large deformations in complex anatomical structures. Existing methods often struggle with high computational costs due to the need for extensive feature extraction and attention computations at various levels of the network. Moreover, some methods do not take into account the spatial relationships of the feature images during registration, and the loss of these spatial relationships leads to suboptimal results for these methods. To this end, we introduce a novel medical image registration network, PSMamba-Net, which leverages optimized iteration and the Mamba framework within a dual-stream pyramid architecture. The network reduces the computational burden by narrowing attention computations at each decoding level, while an optimized iterative registration module at the bottom of the pyramid captures large deformations. This approach eliminates the need for repeated feature extraction, significantly accelerating the registration process. Additionally, the SMB module is incorporated as a decoder to enhance spatial relationship modelling and leverage Mamba's strengths in long-sequence processing. PSMamba-Net balances efficiency and accuracy, surpassing state-of-the-art methods across LPBA40, Mindboggle, and Abdomen CT datasets. Our source code is available at: https://github.com/VCMHE/PSMamba.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70117","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144179339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Facial Expression Recognition Method Based on Improved ConvNeXt","authors":"Dan Chen, Yu Cao, Xu Cheng","doi":"10.1049/ipr2.70118","DOIUrl":"https://doi.org/10.1049/ipr2.70118","url":null,"abstract":"<p>Advanced facial expression recognition technology can significantly enhance human-computer interaction and improve intelligent services for humans. This paper introduces a novel facial expression recognition method utilizing an enhanced ConvNeXt network. By integrating the SENET attention mechanism into the ConvNeXt block, key feature information extraction is effectively enhanced. Additionally, the incorporation of the focal loss (FL) function optimizes the classification performance of the network model. Experimental results show that the improved ConvNeXt network achieves higher accuracy compared to other deep learning models, with accuracy rates of 83.8% and 70.4% on the RAF-DB and FER2013 datasets, respectively.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70118","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144171681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Domain Adaptation of Foreground and Scale Sensing for Gastric Polyp Detection","authors":"Ying Zheng, Junhe Zhang, Yao Yu, Changyin Sun","doi":"10.1049/ipr2.70092","DOIUrl":"https://doi.org/10.1049/ipr2.70092","url":null,"abstract":"<p>Automated detection of gastric polyps has been proven crucial for improving diagnostic accuracy. However, when there is a domain shift in the data, deep learning-based detection methods may not perform well. Unsupervised domain adaptation has been demonstrated as a good approach to address this issue. However, existing unsupervised domain adaptation detection methods struggle to handle the problem of foreground–background similarity and the diverse appearances of polyps at different scales in gastric polyp images. In this paper, we propose a boundary-guided transferable attention module and a transferable prototype alignment module to address the foreground–background similarity issue, and a multi-scale enhanced alignment method to tackle the problem of information loss when aligning polyps at multiple scales. The boundary-guided transferable attention module fully explores spatial information of the image with a boundary-guided multi-field attention mechanism while considering the transferability of features to mine the easily transferable foreground regions. The transferable prototype alignment module adopts a prototype-based method to facilitate the transfer of difficult-to-align regions. The multi-scale enhanced alignment method prevents information loss across feature maps and scales with an attention filtering module, enhancing features at each scale. In experiments, this work outperforms advanced domain adaptation detection methods like SIGMA and CAT in polyp detection.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70092","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144171218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GAIP-VAE: Balancing Reconstruction and Disentanglement in VAE With Group and Individual Priors","authors":"Yi Tian, Zengjie Song","doi":"10.1049/ipr2.70113","DOIUrl":"https://doi.org/10.1049/ipr2.70113","url":null,"abstract":"<p>Disentangled representation learning demonstrates great success in enhancing the explainability, robustness and generalization capability of models across computer vision domains. While adopting the variational auto-encoders (VAEs) to learn disentangled representations holds great promise, these models are prone to suffer from the poor disentanglement capability in complicated datasets, for example, colourful portrait images. These datasets often contain strong correlation among attributes, making it difficult to disentangle them. To alleviate this issue, a novel approach named group and individual priors-based VAE (GAIP-VAE) is proposed, which constrains the semantic attributes by customizing prior information to improve the disentanglement capability of the VAE. Specifically, we start from modelling the joint distribution of the observed data, and then derive three compatible loss terms in the objective function. The first one is the reconstruction term, utilizing the Laplace distribution to improve the image quality. The second one is the individual prior regularizer, encouraging the model to learn more interpretable factors via dimensional-level regularizer. The third one is the group prior regularizer, constraining the approximate posterior distribution through multivariate normal distribution with the tailored correlation. Both quantitative and qualitative experimental results demonstrate that GAIP-VAE can achieve a great balance between image quality and disentanglement capability.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70113","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144171221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vorticity Transport Equation-Based Shadow Removal Approach for Image Inpainting","authors":"Xiaoying Ti, Li Yu, Quanhua Zhao","doi":"10.1049/ipr2.70114","DOIUrl":"https://doi.org/10.1049/ipr2.70114","url":null,"abstract":"<p>Shadows are common in many types of images, causing information loss or disturbance. Shadow removal can help improve the quality of the digital image. If there is no effective information available to restore the original image in the shaded area, the interpolation-based inpainting technique can be used to remove the shadow from the digital image. This image inpainting technique typically involves establishing and solving partial differential equations (PDEs), an iterative solving process that is very time-consuming. To solve the time-consuming problem, a method that introduces the fast marching method (FMM) into the vorticity transport equation (VTE) is demonstrated. VTE is a type of partial differential equation describing two-dimensional fluids. FMM is a numerical scheme for tracking the evolution of monotonically advancing interfaces via finite difference solution of the eikonal equation. The proposed method contains three main steps: (a) by investigating the relationship between VTE and the traditional PDE-based image inpainting method, a new image inpainting model using VTE is developed;(b) the area to be inpainted is divided into boundaries that shrink in layers from the outside inwards using FMM; and (c) the VTE image inpainting model is converted into a weighted average form to coordinate with FMM. The visual and quantitative evaluation of the experimental results of shadow removal shows that the proposed method outperforms PDE-based and state-of-the-art methods in terms of shadow-removal effect and running time. The results also show that our method excels at inpainting images with near-smooth textures and simple geometric structures and where the pixels to be inpainted are continuous with neighbouring pixels.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70114","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144148578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Modified Hierarchical Vision Transformer Model for Poultry Disease Detection","authors":"Michael Agbo Tettey Soli, Dacosta Agyei, Waliyyullah Umar Bandawu, Leonard Mensah Boante, Justice Kwame Appati","doi":"10.1049/ipr2.70115","DOIUrl":"https://doi.org/10.1049/ipr2.70115","url":null,"abstract":"<p>Poultry production faces challenges from diseases like newcastle, salmonella, and coccidiosis, which are critical to global food security, resulting in economic losses and public health concerns. Current detection technologies, such as human inspections and PCR-based procedures, are time-consuming and costly, limiting scalability. Convolutional neural networks (CNNs) like ResNet50 and VGG16 have shown promise for automating disease identification, but they struggle with generalization and collecting fine-grained local and global information. In this study, we propose a deep learning solution based on a hierarchical vision transformer (HViT) model to detect poultry diseases from fecal images. We compare the performance of our HViT model with traditional CNNs (ResNet50, VGG16), lightweight architectures (MobileNetV3_Large_100, XceptionNet), and standard vision transformers (ViT) (ViT-B/16). The experimental results demonstrate that our HViT model outperforms other models, achieving an average validation accuracy of 90.90% with a validation loss of 0.2647. The HViT's ability to balance local and global feature recognition highlights its potential as a scalable solution for real-time poultry disease detection. These findings underscore the significance of hierarchical attention in addressing complex image analysis tasks, with implications for broader applications in agriculture and medical imaging.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70115","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144135672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Quantum Image Encryption With QLSTM and Chaos Synchronisation Control: A Deep Neural Network Approach","authors":"Yuebo Wu, Duansong Wang, Jian Zhou, Huifang Bao","doi":"10.1049/ipr2.70091","DOIUrl":"https://doi.org/10.1049/ipr2.70091","url":null,"abstract":"<p>Quantum image encryption is crucial for data protection, but current methods lack attack resistance and have complex encryption processes. This paper proposes a quantum long short - term memory (QLSTM)-based quantum image encryption method to enhance chaotic sequences and achieve chaos synchronisation control. The QLSTM network improves the Lorenz chaotic sequence, increasing its unpredictability. An adaptive synchronisation control algorithm, using the enhanced chaotic sequence from QLSTM, ensures sender-receiver synchronization. Optimised through deep neural networks, the system maintains stable synchronization under interference. New cryptographic quantum infrastructure (NCQI) was constructed, and images were encrypted using third-order radial diffusion, quantum generalised Arnold transform, and quantum W transform. The QLSTM-improved chaotic sequence showed excellent LLE and 0–1 test results. Information entropy was near 8, with R, G and B channels exceeding 7.999. Anti-attack analysis revealed high information entropy, strong attack resistance, and number of pixels change rate/unified average changing intensity (NPCR/UACI) values of 99.698% and 33.460%, respectively, indicating significant pixel-level changes. Combining quantum chaotic system prediction with the QLSTM model enhanced quantum communication stability and anti-interference ability. This QLSTM-based quantum encryption method, with chaos synchronisation control, significantly improves encryption security and reliability, maintaining high information entropy and complexity under attacks, proving its effectiveness in image encryption.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70091","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ACDR-CRAFF Net: A Multi-Scale Network Based on Adaptive Channel and Coordinate Relational Attention Network for Remote Sensing Scene Classification","authors":"Wei Dai, Haixia Xu, Furong Shi, Liming Yuan, Xinyu Wang, Xianbin Wen","doi":"10.1049/ipr2.70112","DOIUrl":"https://doi.org/10.1049/ipr2.70112","url":null,"abstract":"<p>Accurate classification of remote sensing scene images is crucial for diverse applications, from environmental monitoring to urban planning. While convolutional neural networks (CNNs) have dramatically improved classification accuracy, challenges remain due to the complex distribution of small objects, varied spatial configurations, and intra-class multimodality in remote sensing images. In this work, we make three key contributions to address these challenges. (1) We propose the adaptive channel and coordinate relational attention network (ACDR-CRAFF), a novel multi-scale feature fusion framework designed to enhance feature representation across scales. (2) We introduce two innovative modules: the adaptive channel dimensionality reduction (ACDR) module, which dynamically adjusts channel representations to retain essential low-dimensional features, and the coordinate relational attention multi-scale feature fusion (CRAFF) module, which effectively captures and transfers spatial information between feature levels. (3) By integrating ACDR and CRAFF, our model achieves a progressive fusion of local to global features, ensuring robust feature expressiveness at multiple scales. Experimental results on four widely used benchmark datasets demonstrate that ACDR-CRAFF consistently outperforms several state-of-the-art methods, achieving significant improvements in classification accuracy and setting a new benchmark for complex remote sensing scene classification tasks. These results underscore the effectiveness of our approach in addressing the limitations of existing methods and advancing the state of the art in remote sensing image analysis.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70112","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}