{"title":"Subspace-Guided Feature Reconstruction for Unsupervised Anomaly Localization","authors":"Katsuya Hotta, Chao Zhang, Yoshihiro Hagihara, Takuya Akashi","doi":"10.1049/ipr2.70157","DOIUrl":"10.1049/ipr2.70157","url":null,"abstract":"<p>Unsupervised anomaly localization aims to identify anomalous regions that deviate from normal sample patterns. Most recent methods perform feature matching or reconstruction for the target sample with pre-trained deep neural networks. However, they still struggle to address challenging anomalies because the deep embeddings stored in the memory bank can be less powerful and informative. Specifically, prior methods often overly rely on the finite resources stored in the memory bank, which leads to low robustness to unseen targets. In this paper, we propose a novel subspace-guided feature reconstruction framework to pursue adaptive feature approximation for anomaly localization. It first learns to construct low-dimensional subspaces from the given nominal samples, and then learns to reconstruct the given deep target embedding by linearly combining the subspace basis vectors using the self-expressive model. Our core is that, despite the limited resources in the memory bank, the out-of-bank features can be alternatively “mimicked” to adaptively model the target. Moreover, we propose a sampling method that leverages the sparsity of subspaces and allows the feature reconstruction to depend only on a small resource subset, contributing to less memory overhead. Extensive experiments on three benchmark datasets demonstrate that our approach generally achieves state-of-the-art anomaly localization performance.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70157","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144635091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fuzzy-YOLO Model for Rail Anomaly Detection: Robustness Under Limited Sample and Interference Conditions","authors":"Liyuan Yang, Ming Yang, Ghazali Osman, Safawi Abdul Rahman, Muhammad Firdaus Mustapha","doi":"10.1049/ipr2.70156","DOIUrl":"10.1049/ipr2.70156","url":null,"abstract":"<p>Accurate detection of surface anomalies in railway tracks is critical for ensuring train operation safety and enabling intelligent railway management. However, the scarcity and pronounced imbalance of anomaly samples significantly constrain model training and generalisation. Moreover, complex environmental factors such as illumination variability, sensor noise, and motion blur pose additional challenges to model robustness in real-world applications. This study presents a Fuzzy-YOLO model tailored for limited sample datasets. Built upon YOLOv11, Fuzzy-YOLO incorporates a fuzzy-non-maximum suppression (NMS) mechanism and integrates a lightweight fuzzy residual neural network (RFNN-Res) module based on fuzzy logic for anomaly classification. The final anomaly type is determined via a weighted voting strategy. Experimental evaluations demonstrate that Fuzzy-YOLO achieves a mean average precision (mAP) of 98.90%, exhibiting notably enhanced stability compared to YOLOv11 under conditions of varying illumination, noise, and motion-induced blur.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70156","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144624264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comprehensive Survey of Advancement in Lip Reading Models: Techniques and Future Directions","authors":"Sampada Deshpande, Kalyani Shirsath, Amey Pashte, Pratham Loya, Sandip Shingade, Vijay Sambhe","doi":"10.1049/ipr2.70095","DOIUrl":"10.1049/ipr2.70095","url":null,"abstract":"<p>Lip reading models improve information processing and decision-making by quickly and accurately comprehending enormous amounts of text. This study dives into the important role that lip reading plays in making communication more inclusive, especially for individuals with hearing impairments. From 2020 to 2024, the researchers carefully examine the progress made in lip-reading algorithms. They take a close look at the methods, innovations and principles used to decode spoken content from videos, specifically using visual speech recognition techniques. The study also emphasises the use of datasets like LRW, LRS2 and LRS3, which are crucial for this exploration. This paper offers valuable insights into recent advancements and highlights the importance of diverse datasets in improving lip-reading models. Its findings aim to guide future research efforts in making communication more accessible for people with hearing impairments.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70095","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144615041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepFake Detection: Evaluating the Performance of EfficientNetV2-B2 on Real vs. Fake Image Classification","authors":"Surbhi Bhatia Khan, Muskan Gupta, Bakkiyanathan Gopinathan, Mahesh Thyluru RamaKrishna, Mo Saraee, Arwa Mashat, Ahlam Almusharraf","doi":"10.1049/ipr2.70152","DOIUrl":"10.1049/ipr2.70152","url":null,"abstract":"<p>The surge in digitally altered images has necessitated advanced solutions for reliable image verification, impacting sectors from media to cybersecurity. This work provides an effective method of real vs. deepfake image distinction through utilization of the EfficientNetV2-B2 model, the latest in convolutional neural networks known for its accuracy and effectiveness. The research utilized a big dataset of 100,000 images equally divided between deepfake and real classes to create a balanced sample. The methodology involved preprocessing images to a fixed size, utilizing augmentation techniques to enhance model robustness, and employing a systematic training schedule along with accuracy parameter optimization. Significantly, the research utilized an automated learning rate adjustment mechanism to optimize training performance, contributing to a complex model calibration. Outcome of the experiment design was showing 99.89% classification accuracy and an equally impressive F1 score, which is a measure of the efficiency of the model in identifying deepfakes. The results provided in-depth analysis with some misclassifications, providing recommendations for potential image processing and model training improvements. The outcome points to the suitability of applying EfficientNetV2-B2 where there is a requirement for high accuracy in image authentication.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70152","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144615043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High-Speed Dynamic Measurement Method for Checked Luggage Dimensions","authors":"Yuzhou Chen, Bin Zhang, Hongqing Song, Mingqian Du","doi":"10.1049/ipr2.70148","DOIUrl":"10.1049/ipr2.70148","url":null,"abstract":"<p>Ensuring compliance with stringent luggage size regulations is critical for operational efficiency and cost control in modern airports. However, conventional measurement methods often face a trade-off between speed and accuracy in the dynamic environment of check-in counters. To address these limitations, we propose a real-time luggage dimension and orientation measurement system based on a single RGB-D camera and the YOLOv8 object detection model. As luggage travels at 0.75 m/s along a conveyor, the system first detects and classifies each item, then combines two-dimensional image analysis with three-dimensional point cloud processing to compute length, width, height, and deflection angle. Trained on 7000 annotated images and validated on 100 physical samples, our method achieves average dimensional errors below 4% and angular deviations within 3°, with a mean processing time of 40 ms per item. Comparative experiments demonstrate that, under similar computational constraints, the proposed approach outperforms traditional techniques in both accuracy and robustness, thereby offering a reliable solution for enhancing real-time luggage assessment at airport check-in terminals.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70148","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144615042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Attention Augmentation-Based Transformer Network for Unsupervised Medical Image Registration","authors":"Chuanhui Li, Hao Wang, Hangyu Bai, Xin Sun, Tao Zhang","doi":"10.1049/ipr2.70154","DOIUrl":"10.1049/ipr2.70154","url":null,"abstract":"<p>Transformer-based models have achieved significant success in medical image registration in recent years. Since the self-attention operation has quadratic complexity, it usually causes huge computational overhead for these methods. So, how to provide higher quality registration while being efficient in terms of parameters and computational cost is a research hotspot. For this goal, we propose A<sup>2</sup>TNet, an attention augmentation-based transformer network, wherein the attention augmentation is achieved via combining the spatial attention and channel attention together. Meanwhile, a shifted window mechanism is introduced to further reduce the calculation complexity of the proposed attention module. Experiments carried out on two different brain MRI datasets, LPBA and Mindboggle, demonstrate that A<sup>2</sup>TNet can improve registration accuracy while effectively controlling complexity compared to existing deep learning registration models.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70154","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144615044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DF-3DNet: A Lightweight Approach Based on Deep Learning for 3D Telecommunication Tower Asset Classification","authors":"Amzar Omairi, Zool Hilmi Ismail, Gianmarco Goycochea Casas","doi":"10.1049/ipr2.70149","DOIUrl":"10.1049/ipr2.70149","url":null,"abstract":"<p>The transition from 4G to 5G communication systems and the phase-out of 3G equipment have increased the demand for efficient telecommunication tower inspection and maintenance. Traditional manual methods are time-consuming and risky, prompting the adoption of unmanned aerial vehicles (UAVs) equipped with LiDAR sensors. This research introduces a framework for telecommunication tower asset inspection, utilising a lightweight, deep learning-based 3D classifier called DF-3DNet. The process involves raw 3D point cloud data collection using DJI's Zenmuse L1 LiDAR, optimal flight planning, data pre-processing, augmentation, and classification. The study focuses on two key asset classes—radio frequency (RF) panels and microwave (MW) dishes—which are prevalent in telecommunication towers. DF-3DNet, an enhanced version of PointNet, incorporates advanced data augmentation methods and class balance compensation to optimise performance, particularly when working with limited datasets. The model achieved classification accuracies of 0.6613 on ScanObjectNN, 0.8171 on ModelNet40, and 0.869 on the telecommunication tower dataset, demonstrating its effectiveness in handling noisy, small-scale data. By streamlining inspection workflows and leveraging AI-driven classification, this framework significantly reduces costs, time, and risks associated with traditional methods, paving the way for scalable, real-time telecommunication tower asset management.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70149","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144598524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Fetal Plane Classification Accuracy With Data Augmentation Using Diffusion Models","authors":"Yueying Tian, Elif Ucurum, Xudong Han, Rupert Young, Chris Chatwin, Philip Birch","doi":"10.1049/ipr2.70151","DOIUrl":"10.1049/ipr2.70151","url":null,"abstract":"<p>Ultrasound imaging is widely used in medical diagnosis, especially for fetal health assessment. However, the availability of high-quality annotated ultrasound images is limited, which restricts the training of machine learning models. In this paper, we investigate the use of diffusion models to generate synthetic ultrasound images to improve the performance on fetal plane classification. We train different classifiers first on synthetic images and then fine-tune them with real images. Extensive experimental results demonstrate that incorporating generated images into training pipelines leads to better classification accuracy than training with real images alone. The findings suggest that generating synthetic data using diffusion models can be a valuable tool in overcoming the challenges of data scarcity in ultrasound medical imaging.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70151","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144574127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Systematic Review on Cell Nucleus Instance Segmentation","authors":"Yulin Chen, Qian Huang, Meng Geng, Zhijian Wang, Yi Han","doi":"10.1049/ipr2.70129","DOIUrl":"10.1049/ipr2.70129","url":null,"abstract":"<p>Cell nucleus instance segmentation plays a pivotal role in medical research and clinical diagnosis by providing insights into cell morphology, disease diagnosis, and treatment evaluation. Despite significant efforts from researchers in this field, there remains a lack of a comprehensive and systematic review that consolidates the latest advancements and challenges in this area. In this survey, we offer a thorough overview of existing approaches to nucleus instance segmentation, exploring both traditional and deep learning-based methods. Traditional methods include watershed, thresholding, active contour model, and clustering algorithms, while deep learning methods include one-stage methods and two-stage methods. For these methods, we examine their principles, procedural steps, strengths, and limitations, offering guidance on selecting appropriate techniques for different types of data. Furthermore, we comprehensively investigate the formidable challenges encountered in the field, including ethical implications, robustness under varying imaging conditions, computational constraints, and the scarcity of annotated data. Finally, we outline promising future directions for research, such as privacy-preserving and fair AI systems, domain generalization and adaptation, efficient and lightweight model design, learning from limited annotations, as well as advancing multimodal segmentation models.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70129","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144574128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Image Denoising: A Combination Method Using Multiscale Contextual Fusion and Recursive Learning","authors":"Sonia Rehman, Muhammad Habib, Aftab Farrukh, Aarif Alutaybi","doi":"10.1049/ipr2.70143","DOIUrl":"10.1049/ipr2.70143","url":null,"abstract":"<p>The exponential growth of imaging technology has led to a surge in visual content creation, necessitating advanced image denoising algorithms. Conventional methods, which frequently rely on predefined rules and filters, are inadequate for managing intricate noise patterns while maintaining image features. In order to tackle the issue of real-world image denoising, we investigate and integrate a new novel technique named recursive context fusion network (RCFNet) employing a deep convolutional neural network, demonstrating superior performance compared to current state-of-the-art approaches. RCFNet consists of a coarse feature extraction module and a reconstruction unit, where the former provides a broad contextual understanding and the latter refines the denoising output by preserving spatial and contextual details. Deep CNN learns features instead of using conventional methods, allowing us to improve and refine images. Dual attention units (DUs), in conjunction with the multi-scale resizing Block (MSRB) and selective kernel feature fusion (SKFF), are incorporated into the network to ensure efficient and reliable feature extraction. To demonstrate the advantages and challenges of combining many configurations into a single pipeline, we take a more detailed look at the results. By leveraging the complementary properties of these networks and computational models, we prefer to contribute to the creation of techniques that enhance image restoration while preserving crucial information, therefore encouraging further research and applications in image processing and artificial intelligence. The RCFNet achieves a high structural similarity index (SSIM) of 0.98 and a peak signal-to-noise ratio (PSNR) of 43.4 dB, outperforming many state-of-the-art methods on two benchmark datasets (DND and SIDD) and demonstrating its superior real-world image denoising ability.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70143","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}