{"title":"Tensor Truncated Schatten-p Norm Approximation Tensor Completion Algorithm","authors":"Jianwei Liu, Liangfu Lu, Ping Wang, Haipeng Liu, Yuanchen Huang, Yunliang Zang","doi":"10.1049/ipr2.70171","DOIUrl":"https://doi.org/10.1049/ipr2.70171","url":null,"abstract":"<p>Image data is often degraded during transmission due to hardware limitations or human error, which may hinder subsequent image analysis tasks. Therefore, research on image restoration has significant practical value. Traditional matrix-based algorithms struggle with high-dimensional data, often failing to preserve spatial structures and risking overfitting. In this paper, we investigate tensor recovery problems under the tensor singular value decomposition framework. We introduce a non-convex surrogate for the tensor rank—the tensor truncated Schatten-<span></span><math>\u0000 <semantics>\u0000 <mi>p</mi>\u0000 <annotation>$p$</annotation>\u0000 </semantics></math> norm—and propose two recovery models based on this theory: a tensor completion model and a tensor robust principal component analysis model. Efficient solutions based on the alternating direction method of multipliers are developed for both models. Moreover, we provide a thorough analysis of the computational complexity and convergence behavior of our algorithms. At last, extensive experiments on synthetic data, color images, video sequences, multispectral images, and medical images demonstrate the effectiveness and robustness of the proposed methods.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70171","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144751523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Histopathology Image Enhancement Using Multi-Resolution Deep Learning Techniques","authors":"Meriem Touhami, Zaka Ur Rehman, Md Jahid Hasan, Mohammad Faizal Ahmad Fauzi, Sarina Binti Mansor","doi":"10.1049/ipr2.70166","DOIUrl":"https://doi.org/10.1049/ipr2.70166","url":null,"abstract":"<p>Accurate analysis of histopathology images is essential for disease diagnosis and treatment planning. However, the quality of digital pathology slides is often limited by scanner resolution, which can compromise diagnostic precision and patient care. To address this challenge, we conducted a comparative study evaluating four state of the art image enhancement methods: real enhanced super resolution generative adversarial network (Real-ESRGAN), SwinIR, multi scale image restoration network v2 (MIRNet-v2) and super resolution CNN (SRCNN). Our assessment focused on both quantitative metrics peak signal to noise ratio (PSNR) and structural similarity index (SSIM) and qualitative visual analysis to evaluate detail preservation. The experimental results revealed that SwinIR achieved the best quantitative performance among all evaluated methods, attaining the highest PSNR (35.81) and SSIM (0.95) for lung images from the LC2500 dataset at a 2 <span></span><math>\u0000 <semantics>\u0000 <mo>×</mo>\u0000 <annotation>$times$</annotation>\u0000 </semantics></math> upscaling factor. In contrast, real-ESRGAN excelled in perceptual quality, preserving finer image details more effectively, though it recorded slightly lower numerical scores (PSNR: 33.53, SSIM: 0.92) on the same dataset. These outcomes highlight essential trade off between perceptual fidelity and reconstruction quality, indicating that the optimal choice of enhancement method may vary depending on clinical or diagnostic priorities. The MIRNetv2 method delivered reasonable performance but ranked below both real-ESRGAN and SwinIR. Specifically, it achieved PSNR/SSIM scores of 30.67/0.94 on PR-IHC patches, 32.90/0.95 on lung images, and 31.87/0.95 on colon images, while scoring 29.11 for PR-IHC images in a separate evaluation. SRCNN demonstrated a balanced performance across datasets, achieving PSNR/SSIM values of 31.45/0.88 for lung images, 30.76/0.87 for PR-IHC patches, 32.62/0.93 for colon images, and 33.76/0.91 for PR-IHC. These findings underscore the real ESRGAN as the most effective method for improving the resolution and quality of histopathology images, supporting its potential integration into digital pathology workflows to enhance diagnostic accuracy and patient outcomes.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70166","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144740564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secure Color Traffic Image Transmission Through Enhanced Visual Encryption","authors":"Shuo Zhang, Yiqi Huang, Yueping Wang, Wei Jiang, Bin Wang","doi":"10.1049/ipr2.70172","DOIUrl":"https://doi.org/10.1049/ipr2.70172","url":null,"abstract":"<p>With the rapid advancement of intelligent transportation systems (ITS), secure processing of color traffic images has become critical for traffic management, safety monitoring, and flow analysis. However, ITS are increasingly susceptible to malicious attacks and data tampering. To address this, we propose an advanced visual encryption method specifically for color traffic images. Our approach employs discrete wavelet transform (DWT) to represent images sparsely, followed by dynamic three-dimensional spiral disruption for encryption. Chaotic systems then compress the encrypted data, embedding it into a carrier image via least significant bit (LSB) techniques. Experimental results demonstrate robust visual security, achieving a peak signal-to-noise ratio (PSNR) above 42 dB, strong resilience against common attacks, and a substantial a key space of 2<sup>512</sup>×10<sup>60</sup>, effectively resisting brute-force attacks. Neighboring pixel correlation coefficients drop below 0.15, compared to original images (>0.87), and under shear resistance attack with 128×128 data loss, PSNR remains above 23 dB. Additionally, employing P-tensor product compressed sensing significantly reduces measurement matrix dimensionality, enhancing transmission efficiency. This method offers a viable solution for secure storage and transmission in modern ITS, significantly bolstering privacy and system robustness. The implementation is available at: https://gitee.com/zhangshuo-aly/image-encryption.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70172","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144740275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SUNeXt: Lightweight Medical Image Segmentation Network Based on Grouped Feature Fusion and Shifted Large Kernel Convolution","authors":"Cong Chen, Xiaoxin Guo, Hangyuan Cheng, Guangqi Yang, Hongliang Dong","doi":"10.1049/ipr2.70168","DOIUrl":"https://doi.org/10.1049/ipr2.70168","url":null,"abstract":"<p>To solve efficient image segmentation in practical medical applications in resource-constrained point-of-care environments, the lightweight medical image segmentation network is proposed based on grouped feature fusion and large kernel convolution, which introduces a U-shaped, convolution-based architecture that significantly reduces parameters and computational cost. The proposed model combines shifted large kernel convolution with grouped feature fusion technique in a lightweight and attention-free way, which is specifically designed to fuse features to capture global context. Meanwhile, the grouped multi-scale feature fusion module is proposed to achieve effective cross-layer connectivity and efficient fusion of multi-scale features by grouping deep and shallow features and subsequently applying a lightweight grouped large kernel convolution. The extensive experiments on multiple datasets verify that our model outperforms current popular models in image segmentation with lower parameter quantity and computational cost, and achieves industry-leading performance with low resource consumption.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70168","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144725650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CSFRT: Cross-Spectral Feature-Refined Transformer for Hyperspectral Image Classification","authors":"Xin Wang, Sihao Huang, Gao Chen, Dejiang Wang, Jingwen Yan","doi":"10.1049/ipr2.70161","DOIUrl":"https://doi.org/10.1049/ipr2.70161","url":null,"abstract":"<p>Hyperspectral images (HSIs) contain rich spectral information, which enables accurate target classification. However, HSIs also contain a significant amount of redundant spectral information, and the spectra of different objects often overlap. This overlap in spectral features increases the similarity between different objects' spectra, posing a challenge for classification. Thus, suppressing redundant spectral information and refining key features are critical tasks. To address these challenges, we propose a cross-spectral feature-refined transformer (CSFRT) based on the vision transformer (ViT). In the proposed CSFRT, a two-branch gated-refined feed-forward network (TBGFN) module is introduced to suppress redundant information and enhance key spectral features by utilizing branches with and without gated mechanisms. Additionally, a cross-layer spectral feature-fusion (CLSF) module is proposed to integrate feature information and facilitate information complementarity across different encoder blocks. Extensive experiments are conducted on five different HSI datasets to verify the classification performance of the proposed CSFRT, demonstrating the effectiveness of the architecture.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70161","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144716996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flow-Based SR Optimisation Method Based on Dual-Dimensional Feature Co-Enhancement Module","authors":"Zhengjie Wei, Xiaomin Yang","doi":"10.1049/ipr2.70159","DOIUrl":"https://doi.org/10.1049/ipr2.70159","url":null,"abstract":"<p>The super-resolution (SR) task aims to reconstruct high-resolution content from low-resolution images, and its core challenge is to solve the problem of ill-posedness while balancing the fidelity and perceptual quality of the generated images. Although flow-based SR models have made significant progress in modelling high-resolution image distributions, their generated images still have shortcomings in terms of detail representation. To address the issues above, this paper proposes an optimised method that incorporates a conditionally learnt prior (latent module). Specifically, a Dual-Dimensional Feature Co-Enhancement (DFCE) module is developed to perform joint optimisation on the channel and spatial dimensions of the features. The experimental results on the public datasets show that this framework effectively improves the generation quality of images with almost no increase in the amount of computation. Furthermore, this framework can be seamlessly integrated into fixed-scale and arbitrary-scale streaming models without the need to modify their pre-training weights or architecture, which provides efficient and flexible solutions for practical applications.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70159","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144705148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Band Selection Approach Based on a Mass-Based Metric and Shared Nearest-Neighbours for Hyperspectral Images","authors":"Jikui Wang, Chengzhu Ji, Feifei Liu, Baocheng Yao, Qingsheng Shang, Feiping Nie","doi":"10.1049/ipr2.70165","DOIUrl":"https://doi.org/10.1049/ipr2.70165","url":null,"abstract":"<p>Band selection in hyperspectral imaging is a burgeoning research area whose aim is to select a small number of bands in order to reduce data redundancy and noise bands. The existing ranking-based methods face two challenges: (1) The density calculation using <span></span><math>\u0000 <semantics>\u0000 <mi>k</mi>\u0000 <annotation>${mathrm{k}}$</annotation>\u0000 </semantics></math> nearest neighbours only considers distances between bands, ignoring shared neighbours. Thus, it fails to reflect the local distribution of bands. (2) The high dimensionality of the bands limits the effectiveness of the Euclidean distance-based metric in accurately capturing their similarity. To address the issues, we've proposed an innovative approach for selecting bands, grounded in a mass-based metric and shared nearest neighbours called MBSNN. Initially, we leverage a mass-based metric computation technique to supplant the conventional distance metric between disparate bands. This substitution mitigates the distortions that high-dimensional data can inflict on distance calculations. Subsequently, the natural nearest neighbour method is combined to calculate the local density of the band, reflecting its local distribution characteristics. Finally, an information entropy and peak synergy band selection technique is constructed. To substantiate the merits of our proposed approach, we executed experiments utilising support vector machines across four benchmark datasets. The results of these experiments affirm the effectiveness of our band selection approach.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70165","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144681085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ellipse Fitting of Planar Points With Outliers Using Random Samples Filtered by Fitting Qualities","authors":"Qi Zeng, Xin Li, Siyu Guo","doi":"10.1049/ipr2.70167","DOIUrl":"https://doi.org/10.1049/ipr2.70167","url":null,"abstract":"<p>Ellipse fitting is a traditional approach to construct elliptical models from points. Outliers can significantly distort a fittied ellipse from the actual model. A novel ellipse fitting algorithm is proposed to be applied to planar points with outliers. The fitting qualities of candidate ellipses generated by five-point random sampling are evaluated, and the median curve of the best fittings yields the final result through a classic fitting algorithm. A fitting error metric is introduced. The proposed algorithm achieved median fitting errors of 0.067 on a synthetic dataset and 0.057 on an image dataset, respectively, both the best among the algorithms compared. The execution speed of the novel algorithm is on average 0.091 s on the synthetic dataset and 0.087 s on the image dataset. The algorithm is advantageous also for use due to the comprehensibility and insensitivity of the algorithmic parameters.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70167","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144681084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech2Face3D: A Two-Stage Transfer-Learning Framework for Speech-Driven 3D Facial Animation","authors":"Liming Pang, Zhi Zeng, Yahui Li, Guixuan Zhang, Shuwu Zhang","doi":"10.1049/ipr2.70155","DOIUrl":"https://doi.org/10.1049/ipr2.70155","url":null,"abstract":"<p>High-fidelity, speech-driven 3D facial animation is crucial for immersive applications and virtual avatars. Nevertheless, advancement is impeded by two principal challenges: (1) a lack of high-quality 3D data, and (2) inadequate modelling of the multi-scale characteristics of speech signals. In this paper, we present Speech2Face3D, a novel two-stage transfer-learning framework that pretrains on large-scale pseudo-3D facial data derived from 2D videos and subsequently finetunes on smaller yet high-fidelity 3D datasets. This design leverages the richness of easily accessible 2D resources while mitigating reconstruction noise through a simple temporal smoothing step. Our approach further introduces a Multi-Scale Hierarchical Audio Encoder to capture subtle phoneme transitions, mid-range prosody, and longer-range emotional cues. Extensive experiments on public 3D benchmarks demonstrate that our method achieves state-of-the-art performance on lip synchronization, expression fidelity, and temporal coherence metrics. Qualitative user evaluations validate these quantitative improvements. Speech2Face3D is a robust and scalable framework for utilizing extensive 2D data to generate precise and realistic 3D facial animations only based on speech.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70155","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144681083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pose-Guided Re-Identification of Amur Tigers Under Wild Environmental Constraints","authors":"Tianyu Wang, Boxuan Ma, Xinrui Zhao, Chao Mou, Jiahua Fan","doi":"10.1049/ipr2.70160","DOIUrl":"https://doi.org/10.1049/ipr2.70160","url":null,"abstract":"<p>The conservation of endangered species is contingent upon accurate and efficient wildlife monitoring, which is essential for informed decision-making and effective preservation strategies. With the global population of Amur tigers (Panthera tigris altaica) falling below 600, innovative conservation strategies are critically needed. Traditional monitoring methods have fallen short in accuracy and efficiency, leading to a shift towards leveraging big data and artificial intelligence for effective wildlife surveillance. Existing re-identification techniques struggle with natural habitat challenges like occlusions, changing poses, varying light, and limited data. To overcome these issues, we propose the pose-guided dual branch re-identification network (PDBRNet). Our approach integrates pose estimation to guide feature disentanglement and alignment, crucial for accurate re-identification, while an image preprocessing method considering illumination factors mitigates lighting variations' impact on accuracy. Through validation on the occluded and illumination-varying amur tiger (OIAT) dataset, PDBRNet demonstrates exceptional performance. Specifically, in single-camera scenarios, PDBRNet achieves an outstanding mean average precision (mAP) of 79.4, surpassing the performance of PGCFL (51.6) and PPGNet (69.7). Moreover, in cross-camera scenarios, PDBRNet maintains its superiority with a remarkable mAP of 54.0, along with Rank-1 and Rank-5 scores of 97.8 and 98.9, respectively, showcasing its robustness in real-world surveillance applications. The introduction of PDBRNet significantly enhances re-identification accuracy and holds promise for addressing complexities in field environments, contributing significantly to wildlife conservation efforts.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70160","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144672634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}