{"title":"Quaternion-Based Image Restoration via Saturation-Value Total Variation and Pseudo-Norm Regularization","authors":"Zipeng Fu, Xiaoling Ge, Weixian Qian, Xuelian Yu","doi":"10.1049/ipr2.70219","DOIUrl":"10.1049/ipr2.70219","url":null,"abstract":"<p>Color image restoration is a fundamental task in computer vision and image processing, with extensive real-world applications. In practice, color images often suffer from degradations caused by sensor noise, optical blur, compression artifacts, and data loss during the acquisition, transmission, or storage. Unlike grayscale images, color images exhibit high correlations among their RGB channels. Directly extending grayscale restoration methods to color images often leads to issues such as color distortion and structural artifacts. To address these challenges, this paper proposes a novel quaternion-based color image restoration framework. The method integrates low-rank pseudo-norm constraints with saturation-value total variation (SVTV) regularization, effectively enhancing restoration quality in tasks including denoising, deblurring, and inpainting of degraded color images. The proposed algorithm is efficiently solved using the alternating direction method of multipliers (ADMM), and restoration performance is rigorously evaluated through quantitative metrics including peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and S-CIELAB error. Extensive experimental results demonstrate the superior performance of our method compared to existing approaches.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70219","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145146619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PointGS: Point-Wise Feature-Aware Gaussian Splatting for Sparse View Synthesis","authors":"Lintao Xiang, Hongpei Zheng, Yating Huang, Qijun Yang, Hujun Yin","doi":"10.1049/ipr2.70216","DOIUrl":"10.1049/ipr2.70216","url":null,"abstract":"<p>3D Gaussian splatting (3DGS) is an innovative rendering technique that surpasses the neural radiance field (NeRF) in both rendering speed and visual quality by leveraging an explicit 3D scene representation. Existing 3DGS approaches require a large number of calibrated views to generate a consistent and complete scene representation. When input views are limited, 3DGS tends to overfit the training views, leading to noticeable degradation in rendering quality. To address this limitation, we propose a point-wise feature-aware Gaussian splatting framework that enables real-time, high-quality rendering from sparse training views. Specifically, we employ the latest stereo foundation model to estimate accurate camera poses and reconstruct a dense point cloud for Gaussian initialisation. Then we encode the colour attributes of each 3D Gaussian by sampling and aggregating multiscale 2D appearance features from sparse inputs. To enhance point-wise appearance representation, we design a point interaction network based on a self-attention mechanism, allowing each Gaussian point to interact with its nearest neighbours. These enriched features are subsequently decoded into Gaussian parameters through two lightweight multilayer perceptrons for final rendering. Extensive experiments on diverse benchmarks demonstrate that our method significantly outperforms NeRF-based approaches and achieves competitive performance under few-shot settings compared to the state-of-the-art 3DGS methods.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70216","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145146796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"No-Reference Image Quality Assessment via Semantic-Guided Multi-Scale Feature Extraction","authors":"Peng Ji, Wanjing Wang, Zhongyou Lv, Junhua Wu","doi":"10.1049/ipr2.70221","DOIUrl":"10.1049/ipr2.70221","url":null,"abstract":"<p>Image quality assessment is crucial in the development of digital technology. No-reference image quality assessment aims to predict image quality accurately without depending on reference images. In this paper, we propose a semantic-guided multi-scale feature extraction network for no-reference image quality assessment. The network begins with a scale-wise attention module to capture both global and local features. Subsequently, we design a layer-wise feature guidance block that leverages high-level semantic information to guide low-level feature learning for effective feature fusion. Finally, it predicts quality scores through quality regression using the Kolmogorov–Arnold network. Experimental results with 19 existing methods on six public IQA datasets—LIVE, CSIQ, TID2013, KADID-10k, LIVEC and KonIQ-10k—demonstrate that the proposed method can effectively simulate human perceptions of image quality and is highly adaptable to different distortion types.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70221","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145146603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lightweight Reversible Data Hiding System for Microcontrollers Using Integer Reversible Meixner Transform","authors":"Mohamed Yamni, Achraf Daoui, Chakir El-Kasri, May Almousa, Ali Abdullah S. AlQahtani, Ahmed A. Abd El-Latif","doi":"10.1049/ipr2.70218","DOIUrl":"10.1049/ipr2.70218","url":null,"abstract":"<p>In the realm of secure data communication, reversible data hiding (RDH) has emerged as a promising strategy to ensure both confidentiality and integrity. However, in resource-constrained environments, such as microcontroller platforms, conventional RDH techniques encounter challenges due to factors like minimal memory resources and speed, which restrict the use of microcontrollers for implementing image RDH. Addressing this gap, we introduce a lightweight RDH system tailored for microcontrollers, employing the integer reversible Meixner transform (IRMMT), a variant of the Meixner moment transform optimised for integer operations. Unlike its floating-point version, IRMMT ensures complete preservation of data, even with the use of low finite precision arithmetic, thereby demonstrating its efficacy for lossless applications and its suitability for resource-limited embedded devices. Leveraging IRMMT, we propose a novel RDH algorithm designed to operate efficiently within the limitations of microcontroller resources while preserving image quality and integrity. The algorithm is implemented and evaluated on the Arduino Due board, which features the AT91SAM3X8E 32-bit ARM Cortex-M3 microcontroller, demonstrating the feasibility and effectiveness of the proposed approach in enabling secure wireless data communication. Through theoretical formulation, algorithm design and embedded implementation, this paper contributes to advancing RDH methodologies for resource-limited embedded devices.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70218","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Task Learning for Chinese Character and Radical Recognition With Dynamic Channel-Spatial Attention and Rotational Positional Encoding","authors":"Wei Deng, XuHong Yu, HongWei Li, ShaoWen Du, Bing He","doi":"10.1049/ipr2.70213","DOIUrl":"10.1049/ipr2.70213","url":null,"abstract":"<p>Optical character recognition (OCR) plays a crucial role in digitizing archives and documents. However, recognizing complex Chinese characters remains challenging owing to their intricate structures and sequential patterns. This study introduces an advanced OCR model that integrates EfficientNetV2 as the backbone within a transformer-based architecture to enhance feature extraction. To address the limitations of traditional adaptive feature selection, we propose a dynamic collaborative channel–spatial attention (DCCSA) module. This module combines channel attention, spatial attention, and channel shuffling to dynamically capture global dependencies and optimize feature representations across both spatial and channel dimensions. Additionally, rotational position encoding (RoPE) is incorporated into the transformer to accurately capture the spatial relationships between characters and radicals, ensuring precise representation of complex hierarchal structures. Further, the model adopts a multitask learning framework that jointly decodes characters and radicals, enabling cross-task optimization and significantly enhancing recognition performance. Experimental results on four benchmark datasets demonstrate that the proposed model outperforms existing methods, achieving significant improvements on both printed and handwritten Chinese text. Moreover, the model shows strong generalization capabilities on challenging scene-text datasets, underscoring its effectiveness in addressing the OCR challenges associated with intricate scripts.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70213","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145101616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SCUX-Net: Integrating Multi-Scale Features and Channel-Spatial Attention Model for Intracranial Aneurysm Segmentation","authors":"Aiping Wu, Mingquan Ye, Jiaqi Wang, Ye Shi, Yunfeng Zhou","doi":"10.1049/ipr2.70209","DOIUrl":"10.1049/ipr2.70209","url":null,"abstract":"<p>Intracranial aneurysm is a common cerebrovascular condition, due to the small size and complex anatomical location of intracranial aneurysms, it remains a challenging task to accurately segmenting the intracranial aneurysms in computed tomography angiography (CTA) images. To address these challenges, we propose SCUX-Net, a novel lightweight convolutional neural network designed to facilitate the segmentation of intracranial aneurysms. SCUX-Net builds upon the 3D UX-Net by introducing two key innovations: (1) a spatial adaptive feature module, integrated before each 3D UX-Net block, enabling multi-scale feature fusion for long-range information interaction; (2) a convolutional block attention module, applied after each downsampling block to emphasize important features across channel and spatial dimensions, suppressing irrelevant information. Experimental results substantiate the effectiveness of SCUX-Net in segmenting intracranial aneurysms on CTA images, achieving a dice similarity coefficient of 80% on the test set. Notably, SCUX-Net excels in detecting small aneurysms (<span></span><math>\u0000 <semantics>\u0000 <mo>≤</mo>\u0000 <annotation>$le$</annotation>\u0000 </semantics></math>3 mm) and multiple aneurysms, showcasing its potential for clinical application.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70209","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145101730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Convolutional Strategy for Robust Image Dehazing in Diverse Environments","authors":"Hira Khan, Sung Won Kim","doi":"10.1049/ipr2.70207","DOIUrl":"10.1049/ipr2.70207","url":null,"abstract":"<p>Adverse weather conditions such as haze, fog, and smog degrade image visibility, adversely affecting the performance of vision-based systems. Existing dehazing methods often struggle with non-uniform haze distributions, limited detail restoration, and poor generalization across diverse scenes. To overcome these limitations, this paper presents a deep learning-based dehazing framework that jointly restores image clarity and detail. Unlike conventional algorithms that often neglect fine structure recovery, our architecture incorporates four specialized sub-modules: (i) a noise attention module for enhancing noise suppression and feature preservation; (ii) an adaptive ConvNet module; (iii) a feature extraction module for capturing salient image features; and (iv) a detail refinement module to enhance spatial fidelity. The architecture is trained in an end-to-end manner to restore both structural integrity and colour consistency under challenging conditions. Extensive experiments conducted on synthetic and real-world datasets, including indoor, outdoor, underwater, night-time, and remote sensing scenarios, demonstrate superior generalization capability. In the SOTS indoor dataset, our method achieves a PSNR of 28.44 dB and an SSIM of 0.967, outperforming several state-of-the-art methods. Evaluations using additional metrics such as CIEDE2000 and MSE confirm the effectiveness of the proposed method in handling dense and heterogeneous haze while preserving fine textures and visual fidelity.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70207","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145101729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Two-Tier Deep Learning for Content-Based Image Retrieval and Classification With Dynamic Similarity Fusion","authors":"Aqeel M. Humadi, Mehdi Sadeghzadeh, Hameed A. Younis, Mahdi Mosleh","doi":"10.1049/ipr2.70192","DOIUrl":"10.1049/ipr2.70192","url":null,"abstract":"<p>Content-Based Image Retrieval (CBIR) systems have difficulties with computing efficiency, illumination robustness and noise sensitivity. Traditional methods rely on handcrafted features or monolithic deep learning architectures, which either lack adaptability to diverse image domains or suffer from high computational complexity. To bridge this gap, a unique two-tier deep learning system is presented in this research to overcome these drawbacks. First, a supervised neural network (SNN) reduces dimensionality and improves interpretability by converting HSV colour space into semantic 2D colour labels through pixel-level classification. This addresses the inefficiency of processing raw RGB data while preserving illumination-invariant colour semantics. Second, a Convolutional Neural Network (CNN) greatly increases computing efficiency by processing these labels rather than raw images. By operating on compressed 2D representations, the system achieves faster inference compared to standard 3D CNN pipelines. The framework presents Variable Weight Overall Similarity (VWOS), a versatile similarity metric that combines semantic (softmax) and structural (MaxPool3) elements with dynamically predicted weights using a neural network to automatically optimise retrieval performance based on image content. This adaptive fusion resolves the limitations of fixed-weight similarity measures in handling heterogeneous query types. The system has achieved a performance with precision@10 scores of 0.9-1.0 and classification accuracies of 0.85-0.98 when tested on the PH<sup>2</sup>, Oxford Flowers, Corel-1k, Caltech-101 and Kvasir datasets. Notably, it outperforms current handcrafted, deep learning and hybrid approaches, achieving 1.0 precision@10 on four datasets and 0.96 accuracy on medical Kvasir images. Quantitative comparisons show 9%–14% higher precision than handcrafted methods, 3%–35% improvement over deep learning baselines, and 12% better than hybrid systems. This approach is especially promising for applications involving multimedia retrieval and medical imaging, where interpretability and accuracy are crucial.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70192","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145058071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization of Module Transferability in Single Image Super-Resolution: Universality Assessment and Cycle Residual Blocks","authors":"Haotong Cheng, Zhiqi Zhang, Hao Li, Xinshang Zhang","doi":"10.1049/ipr2.70206","DOIUrl":"10.1049/ipr2.70206","url":null,"abstract":"<p>Deep learning has substantially advanced the single image super-resolution (SISR). However, existing researches have predominantly focused on raw performance gains, with little attention paid to quantifying the transferability of architectural components. In this paper, we introduce the concept of “Universality” and its associated definitions which extend the traditional notion of “Generalization” to encompass the modules' ease of transferability. Then we propose the universality assessment equation (UAE), a metric which quantifies how readily a given module could be transplanted across models and reveals the combined influence of multiple existing metrics on transferability. Guided by the UAE results of standard residual blocks and other plug-and-play modules, we further design two optimized modules, cycle residual block (CRB) and depth-wise cycle residual block (DCRB). Through comprehensive experiments on natural-scene benchmarks, remote-sensing datasets and other low-level tasks, we demonstrate that networks embedded with the proposed plug-and-play modules outperform several state-of-the-arts, reaching a PSNR enhancement of up to 0.83 dB or enabling a 71.3% reduction in parameters with negligible loss in reconstruction fidelity. Similar optimization approaches could be applied to a broader range of basic modules, offering a new paradigm for the design of plug-and-play modules.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70206","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145038394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning and IoT Fusion for Crop Health Monitoring: A High-Accuracy, Edge-Optimised Model for Smart Farming","authors":"Thomas Kinyanjui Njoroge, Edwin Juma Omol, Vincent Omollo Nyangaresi","doi":"10.1049/ipr2.70208","DOIUrl":"10.1049/ipr2.70208","url":null,"abstract":"<p>Crop diseases and adverse field conditions threaten global food security, particularly in resource-limited regions. Current deep-learning models for disease detection suffer from insufficient accuracy, high prediction instability under field noise, and a lack of integration with environmental context. To address these limitations, we present a hybrid deep learning architecture combining EfficientNetV2, MobileNetV2, and Vision Transformers, augmented with attention mechanisms and multiscale feature fusion. Optimised for edge deployment via TensorFlow Lite and integrated with IoT sensors for real-time soil and field monitoring, the model achieved state-of-the-art performance with 99.2% accuracy, 0.993 precision, 0.993 recall, and a near-perfect AUC of 0.999998, outperforming benchmarks like DenseNet50 (88.4%) and ShuffleNet (95.8%). Training on 76 classes (22 diseases) demonstrated rapid convergence and robustness, with validation accuracy reaching 98.7% and minimal overfitting. Statistical validation confirmed superior stability, with 69% lower prediction variance (0.000010) than DenseNet50 (0.000035), ensuring reliable performance under real-world noise. Bayesian testing showed a 100% probability of superiority over DenseNet50 and 85.1% over ShuffleNet, while field trials on 249 real-world images achieved 97.97% accuracy, highlighting strong generalisation. IoT integration reduced false diagnoses by 92% through environmental correlation, and edge optimisation enabled real-time inference via a 30.4 MB mobile application (0.094-second latency). This work advances precision agriculture through a scalable, cloud-independent framework that unifies hybrid deep learning with edge-compatible IoT sensing. By addressing critical gaps in accuracy, stability, and contextual awareness, the system enhances crop health management in low-resource settings, offering a statistically validated tool for sustainable farming practices.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.2,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70208","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145038070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}