{"title":"A Review of the Image Segmentation Methods Using Rough Sets","authors":"Yuanyuan Tian, Hongliang Wang, Xiaolong Zhu, Haitao Guo","doi":"10.1049/ipr2.70141","DOIUrl":"10.1049/ipr2.70141","url":null,"abstract":"<p>Image segmentation is a major problem in image processing, and at the same time, it is a classical problem. Rough set theory is a set of theories that study the representation, learning, and induction of incomplete data imprecise knowledge, and so on. Rough set theory has good applicability in image segmentation because of its good ability to deal with vague and uncertain problems and its characteristics of fast convergence and avoiding local minima in solving optimization problems. The main content of this paper is to review the existing methods for image segmentation based on rough sets, categorize them, and describe the main ideas, advantages, disadvantages, and conditions of use of each method. Some of the methods for image segmentation based on rough sets utilize only rough sets, but most of them combine rough sets with other theories or methods. Therefore, this paper classifies existing methods for image segmentation based on rough sets according to whether they combine with other theories or methods, and what kind of theories or methods they combine with. This paper also provides an outlook on the development trends of the methods for image segmentation based on rough sets. 
This paper is written with the aim of making the researchers who are engaged in the methods for image segmentation based on rough sets understand the current status of the research works in this field within a short time.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70141","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144339140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
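A minimal sketch of the core rough-set notions the surveyed segmentation methods build on: pixels are partitioned into equivalence classes (e.g., by quantized intensity), and a candidate segment is bracketed by lower and upper approximations. This is a generic illustration of the theory, not any specific method from the survey; the toy partition and region are made up.

```python
def approximations(equiv_classes, X):
    """Return (lower, upper) approximations of a pixel set X
    with respect to a partition into equivalence classes."""
    X = set(X)
    lower, upper = set(), set()
    for cls in equiv_classes:
        cls = set(cls)
        if cls <= X:        # class entirely inside X -> certainly in the segment
            lower |= cls
        if cls & X:         # class overlaps X -> possibly in the segment
            upper |= cls
    return lower, upper

# Toy "image": 8 pixels partitioned by quantized gray level.
classes = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
region = {2, 3, 4}                  # a candidate segment
lo, up = approximations(classes, region)
boundary = up - lo                  # the uncertain pixels segmentation must resolve
```

The boundary region (here pixels 4 and 5) is exactly the vagueness that rough-set segmentation methods reason about.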
{"title":"A New Weighted Nuclear Norm Regularization Model for Removing Salt and Pepper Noise With Applications","authors":"Lili Meng, Zhiyi Lu, Huizhong Xue, Kexin Shi, Kai Zhou","doi":"10.1049/ipr2.70138","DOIUrl":"10.1049/ipr2.70138","url":null,"abstract":"<p>The application of weighted kernel norm to image denoising has gained significant research interest in recent years by using the non-local self-similarity of images. In this paper, we propose a novel model for removing salt and pepper noise that integrates weighted kernel norm with higher-order total variation regularization. Subsequently, we use the classical method of alternating direction of multipliers and introduce some auxiliary variables to transform the original problem into saddle point problem. To illustrate the analytical results, a series of numerical simulations are conducted. Finally, experimental comparisons demonstrate the superior performance of the proposed model, which outperforms other competitive methods in terms of both signal-to-noise ratio and structural similarity index.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70138","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144323419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CDRWF: Compressed Domain Based Robust Watermarking Framework for Colored Images","authors":"Samrah Mehraj, Subreena Mushtaq, Shabir A. Parah","doi":"10.1049/ipr2.70109","DOIUrl":"10.1049/ipr2.70109","url":null,"abstract":"<p>As the volume of digital data increases, there is an increasing need for effective compression methods to address storage demands. Concurrently, the importance of robust image watermarking for authentication and ownership verification cannot be overstated. This work tackles the dual challenge of optimizing image compression for storage conservation and implementing strong image watermarking for copyright protection. The suggested approach integrates the K-means clustering compression algorithm to enhance storage efficiency along with a resilient image watermarking technique based on spatial-domain embedding. We introduce a blind robust watermarking approach that uses zero-frequency coefficient alteration independently in the spatial domain instead of using the discrete cosine transformation (DCT) to verify the ownership of colored images. To enhance the robustness of the system, we have incorporated two watermarks into the cover image. This precaution ensures that even if one watermark undergoes deterioration due to attacks, authentication can still be assured by recovering the other watermark. Compared to frequency-domain approaches, our scheme yields better robustness and reduced computing complexity. The average peak signal-to-noise ratio (PSNR) for the test images using our approach is above 39 dB with a compression ratio equal to 5.9978, removing up to 83% of the redundancy of the host image. After comparing our approach with several state-of-the-art methods, its robustness is exposed by the values of normalized correlation coefficient (NCC) close to one and bit error rate (BER) values close to zero. Besides, the scheme is able to embed a total of 8192 watermark bits in the host image of size 512 × 512 × 3. 
Experimental results affirm the effectiveness of the proposed methodology, marking it as a valuable contribution to the domains of image processing and information security.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70109","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144323420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
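The compression half of the pipeline above rests on K-means color quantization: cluster pixel colors into a small palette and store per-pixel labels plus the palette instead of raw colors. A minimal Lloyd's-algorithm sketch of that idea follows; the deterministic initialization, cluster count, and toy data are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def kmeans_quantize(pixels, k=4, iters=10):
    """Cluster pixel colors into k palette entries (Lloyd's algorithm).
    Storing per-pixel labels plus the small palette, instead of raw
    colors, is the compression step."""
    idx = np.linspace(0, len(pixels) - 1, k).astype(int)
    centers = pixels[idx].astype(float)            # deterministic init
    for _ in range(iters):
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                  # nearest palette color
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return centers, labels

rng = np.random.default_rng(1)
# Toy "image": two well-separated color populations of 50 RGB pixels each.
img = np.vstack([rng.normal(40.0, 2.0, (50, 3)), rng.normal(200.0, 2.0, (50, 3))])
palette, labels = kmeans_quantize(img, k=2)
recon = palette[labels]                            # decompressed approximation
```

Reconstruction replaces each pixel with its palette color, so fidelity degrades gracefully as k shrinks and the compression ratio grows.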
{"title":"Image Compression Algorithm Based on Region of Interest Extraction for Unmanned Aerial Vehicles Communication","authors":"Yanxia Liang, Tong Jia, Xin Liu, Huanhuan Zhang","doi":"10.1049/ipr2.70137","DOIUrl":"10.1049/ipr2.70137","url":null,"abstract":"<p>Unmanned aerial vehicles (UAVs) are widely used but face challenges of limited storage and bandwidth. In this research, we propose an image compression algorithm tailored for UAV communication, termed region of interest extraction for UAV communication (ROIE-UC). First, image pixels are clustered into super pixel blocks using the simple linear iterative clustering (SLIC). Second, these super pixels are grouped into regions of interest (ROI) using the density-based spatial clustering of applications with noise (DBSCAN). The image is then segmented into ROI and non-ROI areas based on these clusters. Lossless compression is applied to the ROI, while lossy compression with a high ratio is used for non-ROI regions. At the receiving end, the image is decompressed and reconstructed. Experiments show ROIE-UC gets a peak signal-to-noise ratio (PSNR) of 46.37 dB and an feature similarity index (FSIM) of 99.99% for ROI. It outperforms JPEG in PSNR (up to 28.52% improvement), FSIM (0.15% improvement), and compression ratio. When PSNR and FSIM are similar, its max compression ratio is 5.89 times that of JPEG. It also has up to 51.49% higher PSNR than other methods. 
ROIE-UC is an effective solution for UAV image processing and data compression.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70137","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144323421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
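The ROI/non-ROI split described above can be sketched in a few lines once a mask is available: keep ROI pixels bit-exact, degrade the rest. In this toy version the mask is given directly (the SLIC/DBSCAN extraction is omitted) and uniform quantization stands in for the real lossy codec; both are illustrative assumptions.

```python
import numpy as np

def roi_compress(img, roi_mask, step=32):
    """Keep ROI pixels lossless; coarsely quantize non-ROI pixels.
    Uniform mid-rise quantization stands in for the lossy codec."""
    out = img.copy()
    q = (img // step) * step + step // 2      # snap to bin centers
    out[~roi_mask] = q[~roi_mask]
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (8, 8), dtype=np.int64)   # toy grayscale image
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True                         # hypothetical ROI from clustering
rec = roi_compress(img, mask)
```

The ROI survives reconstruction exactly (infinite PSNR locally), while background error is bounded by half a quantization step, mirroring the lossless/lossy trade ROIE-UC exploits.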
{"title":"An X-Ray Contraband Detection Method Based on Improved YOLOv8","authors":"Jianing Chen, Juan Hao, Xiaoqun Liu","doi":"10.1049/ipr2.70135","DOIUrl":"10.1049/ipr2.70135","url":null,"abstract":"<p>X-ray detection of contraband is crucial for public safety; however, it often faces challenges due to cluttered backgrounds and overlapping objects in security inspection images. This study proposes a novel detection framework based on You Only Look Once version 8 (YOLOv8), incorporating three key innovations: multi-scale cross-axis attention (MCA), which captures global dependencies through horizontal and vertical collaborative attention, effectively mitigating irrelevant features in complex X-ray scenarios; a lightweight bottleneck architecture using partial convolution (PConv), which significantly reduces floating point operations (FLOPs) while preserving positional sensitivity; and the focal-enhanced intersection over union (Focaler-IoU) loss function, which dynamically weights difficult samples to enhance regression accuracy. Experiments on the prohibited item detection in the X-ray dataset revealed that our model achieves a mean average precision (IoU = 0.5) ([email protected]) of 97.3%, outperforming YOLOv8s by 1.2 percentage points, and maintains real-time performance of 121 frames per second, surpassing YOLOv10-S (96.5%) and YOLOv12-S (96.8%). Ablation studies highlight the contribution of each module: MCA enhances mAP by 0.7%, PConv decreases FLOPs by 31%, and Focaler-IoU increases precision by 0.9% and recall by 2.4%. 
The proposed method exhibits substantial potential for real-time security inspections.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70135","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144300336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
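Focaler-IoU is commonly described as a linear interval remapping of IoU: values below a lower bound d count as 0, values above an upper bound u count as 1, and the band in between is rescaled, which focuses the regression loss on a chosen difficulty range. The sketch below implements that mapping; the interval endpoints are illustrative, and this is a generic reading of the idea rather than the exact configuration used in this paper.

```python
import numpy as np

def focaler_iou(iou, d=0.0, u=0.95):
    """Linear interval reconstruction of IoU: 0 below d, 1 above u,
    linearly rescaled in between. Choosing (d, u) selects which
    difficulty band dominates the loss. Values here are illustrative."""
    return np.clip((np.asarray(iou, dtype=float) - d) / (u - d), 0.0, 1.0)

ious = np.array([0.0, 0.475, 0.95, 1.0])
loss = 1.0 - focaler_iou(ious)        # higher loss for poorly localized boxes
```

With d = 0 and u = 0.95, boxes already above 0.95 IoU contribute no loss, pushing gradient budget toward harder samples.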
{"title":"Face Forgery Detection via Multi-Scale and Multi-Domain Features Fusion","authors":"Rongrong Gong, Jiahao Chen, Dengyong Zhang, Arun Kumar Sangaiah, Mohammed J. F. Alenazi","doi":"10.1049/ipr2.70131","DOIUrl":"10.1049/ipr2.70131","url":null,"abstract":"<p>Deepfake, as a popular form of visual forgery technique on the Internet, poses a serious threat to individuals' data privacy and security. In consumer electronics, fraudulent schemes leveraging Deepfake technology are widespread, making it urgent to safeguard users' data privacy and security. However, many Deepfake detection methods based on Convolutional Neural Networks (CNNs) struggle to achieve satisfactory performance on mainstream datasets, especially with heavily compressed images. Observing that tampered images leave traces in the frequency domain, which are imperceptible to the naked eye but detectable through spectrum analysis, this study proposes a novel face forgery detection framework integrating spatial and frequency domain features. The framework introduces three innovative modules: the cross-attention fusion module (CAFM), the guided attention module (GAM), and the multi-scale feature fusion module (MSFFM), Specifically, CAFM combines spatial and frequency-domain features through cross-attention to enhance feature interaction. GAM generates attention maps to refine the integration of spatial and frequency features, while MSFFM fuses multi-scale hierarchical features to capture both global and local tampering artifacts. These modules collectively improve the richness and discrimination of the extracted features, contributing to the overall detection performance. The proposed method demonstrates its effectiveness and superiority in forgery detection tasks, achieving a 3.9% average improvement in AUC compared to the state-of-the-art method GocNet [1] on FaceForensics++ (FF++) and WildDeepfake datasets. 
Extensive experiments further validate the effectiveness of our approach.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70131","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144309011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
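The framework's premise is that forgeries leave spectral traces invisible in the pixel domain. A generic way to expose such traces is to remove the low-frequency content with an FFT and inspect the high-frequency residual, where blending and resampling artifacts concentrate. The sketch below shows only that residual-extraction step; it is a textbook illustration, not the paper's CAFM/GAM/MSFFM pipeline, and the cutoff radius is arbitrary.

```python
import numpy as np

def highfreq_residual(img, radius=4):
    """Zero out the low-frequency core of a grayscale image's spectrum
    and return the high-frequency residual, a simple frequency-domain
    feature for tampering analysis."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    lowpass = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    F[lowpass] = 0                      # discard smooth image content
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

img = np.ones((16, 16))                 # constant image: only DC energy
res = highfreq_residual(img)            # residual is ~0 everywhere
```

A real detector would feed such residuals (or full spectra) into the network alongside spatial features, which is the fusion this paper's modules perform.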
{"title":"SEM-YOLO: A Small Target Defect Detection Model for Photovoltaic Modules","authors":"Wang Yun, Yin Wang, Gang Xie, Zhicheng Zhao","doi":"10.1049/ipr2.70134","DOIUrl":"10.1049/ipr2.70134","url":null,"abstract":"<p>Defect detection is key to extending the lifetime of PV modules. However, existing methods still face significant challenges in detecting small and ambiguous targets. To this end, this paper proposes a PV module defect detection model, SEM-YOLO, based on YOLOv8. The model improves the performance through the following improvements: first, the SPD-Conv module is introduced to replace the traditional convolution in the backbone and neck sections to reduce the information loss caused by excessive down-sampling, thus enhancing the detection of small targets. Second, the neck section C2f-EMA module is introduced, in which the efficient multiscale attention module (EMA) enhances feature extraction by redistributing weights and prioritizing relevant features to improve the perception and recognition of small target defects (hot spots). Finally, we add a small target detection layer and increase the MultiSEAM detection header, so that the model can capture and detect small targets more efficiently at the output stage. The experimental results show that the mAP of the improved model reaches 93.8%, among which the mAP of small target defects reaches 83%, which is an improvement of 2.23% and 7.62% compared with YOLOv8. 
In addition, compared with the mainstream models (RT-DETR, YOLOv9s, YOLOv10n, and YOLOv11), the detection accuracies in terms of overall and small-target defects are significantly improved, which further validates the effectiveness of the model.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70134","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144300338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
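SPD-Conv's key move is replacing strided downsampling with a space-to-depth rearrangement followed by a non-strided convolution, so resolution drops without discarding any values. The numpy sketch below shows the space-to-depth step on a small feature map; shapes and data are illustrative.

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange a (C, H, W) map into (C*scale^2, H/scale, W/scale):
    spatial resolution drops, but every value survives as a new channel,
    unlike strided convolution or pooling, which discard information."""
    c, h, w = x.shape
    assert h % scale == 0 and w % scale == 0
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    return x.transpose(0, 2, 4, 1, 3).reshape(c * scale * scale,
                                              h // scale, w // scale)

x = np.arange(2 * 4 * 4).reshape(2, 4, 4).astype(float)
y = space_to_depth(x)                 # (2, 4, 4) -> (8, 2, 2), lossless
```

In SPD-Conv this rearranged tensor is then processed by a stride-1 convolution, which is why fine details of small defects are preserved through downsampling.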
{"title":"A Lightweight Channel Correlation Invertible Network for Image Denoising","authors":"Fuxian Sui, Hua Wang, Fan Zhang","doi":"10.1049/ipr2.70119","DOIUrl":"10.1049/ipr2.70119","url":null,"abstract":"<p>In recent years, deep learning has made significant progress in image denoising. However, the complexity of advanced methods' systems is also increasing, which will increase the calculation cost and hinder the convenient analysis and comparison of methods. Therefore, a lightweight model based on invertible networks is proposed. The invertible network has great advantages in image denoising. It is lightweight, memory-saving, and information-lossless in backpropagation. To effectively remove the noise and restore a clean image, the high-frequency part of the image is resampled and modeled to remove the impact of noise better. The channel context block is proposed to better focus on useful channels and improve the network's perception of useful information in images while ensuring the complexity and computing cost. At the same time, the residual structure with channel correlation modeling is used to extract the features in the convolutional flow, to effectively retain the details and texture of the image, and learn more details of the spatial features of the image, so as to prevent the blur and distortion of the image in the denoising process. 
The proposed method allows the model to enjoy lower computational complexity on the premise of ensuring performance.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70119","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144300337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
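The memory-saving, information-lossless property of invertible networks comes from coupling layers whose inverse is exact, so intermediate activations can be recomputed during backpropagation instead of stored. A minimal additive-coupling sketch, generic rather than this paper's architecture, with an arbitrary branch function standing in for a learned sub-network:

```python
import numpy as np

def coupling_forward(x1, x2, f):
    """Additive coupling: y1 = x1, y2 = x2 + f(x1).
    Exactly invertible for ANY function f, since f is only ever
    evaluated on the unchanged half."""
    return x1, x2 + f(x1)

def coupling_inverse(y1, y2, f):
    return y1, y2 - f(y1)

f = lambda t: 0.5 * np.tanh(t)        # stand-in for a learned branch
rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(8), rng.standard_normal(8)
y1, y2 = coupling_forward(x1, x2, f)
r1, r2 = coupling_inverse(y1, y2, f)  # recovers (x1, x2) exactly
```

Stacking such layers (alternating which half is transformed) yields a deep network that never loses information, which is the backbone property the abstract relies on.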
{"title":"A Data-Driven Solution for Large-Scale Open-Pit Mines Excavation Monitoring Based on 3D Point Cloud","authors":"Taiming He, Jiasui Zhang, Lu Yang","doi":"10.1049/ipr2.70130","DOIUrl":"10.1049/ipr2.70130","url":null,"abstract":"<p>We present an adaptive point cloud workflow that withstands heavy environmental noise and the large datasets typical of open-pit mines. The workflow automatically tunes its parameters from the statistics of each input scene, eliminating manual parameter tuning. For instance, it sets the ICP correspondence distance and the clustering threshold without user input. Additionally, our method integrates a coarse-to-fine registration strategy, robust change detection, and precise volumetric estimation based on digital elevation models. Experiments on simulated mining datasets show our method remains robust under heavy noise and misalignment, with volume errors consistently below <span></span><math>\u0000 <semantics>\u0000 <mrow>\u0000 <mn>2</mn>\u0000 <mo>%</mo>\u0000 </mrow>\u0000 <annotation>$2%$</annotation>\u0000 </semantics></math>. A field pilot study at a limestone quarry further underscores its practical reliability and operational robustness. This research provides a precise, automated solution for real-time mining monitoring, effectively advancing sustainable and intelligent mining practices. 
Source code and datasets are publicly available at github.com/deemoe404/volcal_baseline.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70130","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144292491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
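Once registered point clouds are rasterized into digital elevation models (DEMs), volumetric estimation reduces to integrating per-cell height change over the grid. A minimal numpy sketch of that final step, with a hypothetical grid spacing (the registration and change-detection stages are omitted):

```python
import numpy as np

def dem_volume_change(dem_before, dem_after, cell_area=1.0):
    """Cut/fill volumes from two gridded DEMs: sum each cell's height
    change times the cell's footprint area."""
    dh = dem_after - dem_before
    cut = -dh[dh < 0].sum() * cell_area       # material excavated
    fill = dh[dh > 0].sum() * cell_area       # material deposited
    return cut, fill

before = np.zeros((10, 10))
after = np.zeros((10, 10))
after[2:5, 2:5] = -2.0                        # a 3x3-cell pit, 2 m deep
# Hypothetical 2 m grid spacing -> 4 m^2 per cell.
cut, fill = dem_volume_change(before, after, cell_area=4.0)
```

Here the excavated volume is 9 cells × 2 m × 4 m² = 72 m³; real DEMs would add interpolation over data gaps and uncertainty from registration error.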
{"title":"A Survey on Face-Swapping Methods for Identity Manipulation in Deepfake Applications","authors":"Ramamurthy Dhanyalakshmi, Gabriel Stoian, Daniela Danciulescu, Duraisamy Jude Hemanth","doi":"10.1049/ipr2.70132","DOIUrl":"10.1049/ipr2.70132","url":null,"abstract":"<p>A face-swapping framework is designed to generate an image or video that merges the pose and characteristics of the input image with the identity from the source image. It has found significant applications in entertainment, privacy protection and digital content creation. However, this process is inherently complex, involving challenges like identity preservation, expression consistency and photorealism. Despite the rapid advancements in face-swapping technology, there has been a noticeable lack of in-depth analysis of the intricate mechanisms and recent developments in this field. This work attempts to bridge that gap by providing an extensive overview of face-swapping methods based on deep learning. Researchers, developers and practitioners interested in learning about the state of face-swapping technology and its possible uses may find this survey to be an invaluable resource. It will provide insights that can inform future research and innovation in this fast-evolving area.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70132","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144273145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}