IET Image Processing: Latest Publications

Quaternion-Based Image Restoration via Saturation-Value Total Variation and Pseudo-Norm Regularization
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-27 · DOI: 10.1049/ipr2.70219
Zipeng Fu, Xiaoling Ge, Weixian Qian, Xuelian Yu
Abstract: Color image restoration is a fundamental task in computer vision and image processing, with extensive real-world applications. In practice, color images often suffer from degradations caused by sensor noise, optical blur, compression artifacts, and data loss during acquisition, transmission, or storage. Unlike grayscale images, color images exhibit high correlations among their RGB channels, and directly extending grayscale restoration methods to color images often leads to color distortion and structural artifacts. To address these challenges, this paper proposes a novel quaternion-based color image restoration framework. The method integrates low-rank pseudo-norm constraints with saturation-value total variation (SVTV) regularization, effectively enhancing restoration quality in denoising, deblurring, and inpainting tasks on degraded color images. The proposed algorithm is solved efficiently using the alternating direction method of multipliers (ADMM), and restoration performance is evaluated with quantitative metrics including peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and S-CIELAB error. Extensive experimental results demonstrate the superior performance of the method compared to existing approaches.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70219
Citations: 0
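To make the regularizer concrete, here is a minimal sketch that measures anisotropic total variation on the saturation and value channels of an RGB image. It approximates only the quantities the SVTV term penalises; the paper's quaternion representation, pseudo-norm constraint, and ADMM solver are not reproduced.

```python
import numpy as np

def tv(channel):
    # Anisotropic total variation: sum of absolute forward differences.
    return np.abs(np.diff(channel, axis=1)).sum() + np.abs(np.diff(channel, axis=0)).sum()

def svtv(rgb):
    # rgb: float array in [0, 1], shape (H, W, 3).
    value = rgb.max(axis=2)                        # HSV value channel
    chroma = value - rgb.min(axis=2)
    saturation = np.where(value > 0, chroma / np.maximum(value, 1e-8), 0.0)
    return tv(saturation) + tv(value)              # the SVTV-style penalty
```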
PointGS: Point-Wise Feature-Aware Gaussian Splatting for Sparse View Synthesis
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-26 · DOI: 10.1049/ipr2.70216
Lintao Xiang, Hongpei Zheng, Yating Huang, Qijun Yang, Hujun Yin
Abstract: 3D Gaussian splatting (3DGS) is an innovative rendering technique that surpasses the neural radiance field (NeRF) in both rendering speed and visual quality by leveraging an explicit 3D scene representation. Existing 3DGS approaches require a large number of calibrated views to generate a consistent and complete scene representation; when input views are limited, 3DGS tends to overfit the training views, leading to noticeable degradation in rendering quality. To address this limitation, we propose a point-wise feature-aware Gaussian splatting framework that enables real-time, high-quality rendering from sparse training views. Specifically, we employ the latest stereo foundation model to estimate accurate camera poses and reconstruct a dense point cloud for Gaussian initialisation. We then encode the colour attributes of each 3D Gaussian by sampling and aggregating multiscale 2D appearance features from the sparse inputs. To enhance point-wise appearance representation, we design a point interaction network based on a self-attention mechanism, allowing each Gaussian point to interact with its nearest neighbours. These enriched features are subsequently decoded into Gaussian parameters through two lightweight multilayer perceptrons for final rendering. Extensive experiments on diverse benchmarks demonstrate that our method significantly outperforms NeRF-based approaches and achieves competitive performance under few-shot settings compared to state-of-the-art 3DGS methods.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70216
Citations: 0
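The point interaction idea can be pictured with a short sketch: single-head attention over each point's k nearest neighbours. The single-head design and the dot-product scaling are illustrative assumptions, not the paper's exact network.

```python
import torch

def knn_self_attention(feats, xyz, k=8):
    # feats: (N, C) per-point features; xyz: (N, 3) point positions.
    dist = torch.cdist(xyz, xyz)                   # (N, N) pairwise distances
    idx = dist.topk(k, largest=False).indices      # (N, k) neighbours (self included)
    neigh = feats[idx]                             # (N, k, C) neighbour features
    scores = (feats.unsqueeze(1) * neigh).sum(-1) / feats.shape[-1] ** 0.5
    attn = torch.softmax(scores, dim=1)            # (N, k) attention weights
    return (attn.unsqueeze(-1) * neigh).sum(1)     # (N, C) enriched features
```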
No-Reference Image Quality Assessment via Semantic-Guided Multi-Scale Feature Extraction
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-26 · DOI: 10.1049/ipr2.70221
Peng Ji, Wanjing Wang, Zhongyou Lv, Junhua Wu
Abstract: Image quality assessment is crucial in the development of digital technology. No-reference image quality assessment aims to predict image quality accurately without depending on reference images. In this paper, we propose a semantic-guided multi-scale feature extraction network for no-reference image quality assessment. The network begins with a scale-wise attention module to capture both global and local features. Subsequently, we design a layer-wise feature guidance block that leverages high-level semantic information to guide low-level feature learning for effective feature fusion. Finally, it predicts quality scores through quality regression using the Kolmogorov–Arnold network. Experimental comparisons with 19 existing methods on six public IQA datasets (LIVE, CSIQ, TID2013, KADID-10k, LIVEC, and KonIQ-10k) demonstrate that the proposed method can effectively simulate human perception of image quality and is highly adaptable to different distortion types.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70221
Citations: 0
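One plausible reading of the layer-wise guidance block is a channel gate in which pooled high-level semantics modulate low-level features before fusion. The module below is a hypothetical sketch under that assumption; the names and shapes are illustrative, not the paper's design.

```python
import torch
import torch.nn as nn

class LayerwiseGuidance(nn.Module):
    # Hypothetical: a pooled high-level semantic vector produces sigmoid
    # channel gates that are applied to a low-level feature map before fusion.
    def __init__(self, low_ch, high_ch):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(high_ch, low_ch), nn.Sigmoid())

    def forward(self, low, high):
        # low: (B, C_low, H, W); high: (B, C_high) pooled semantic vector.
        g = self.gate(high).unsqueeze(-1).unsqueeze(-1)   # (B, C_low, 1, 1)
        return low * g
```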
Lightweight Reversible Data Hiding System for Microcontrollers Using Integer Reversible Meixner Transform
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-23 · DOI: 10.1049/ipr2.70218
Mohamed Yamni, Achraf Daoui, Chakir El-Kasri, May Almousa, Ali Abdullah S. AlQahtani, Ahmed A. Abd El-Latif
Abstract: In the realm of secure data communication, reversible data hiding (RDH) has emerged as a promising strategy to ensure both confidentiality and integrity. However, in resource-constrained environments such as microcontroller platforms, conventional RDH techniques encounter challenges due to limited memory and processing speed, which restrict the use of microcontrollers for implementing image RDH. Addressing this gap, we introduce a lightweight RDH system tailored for microcontrollers, employing the integer reversible Meixner transform (IRMMT), a variant of the Meixner moment transform optimised for integer operations. Unlike its floating-point version, IRMMT ensures complete preservation of data even under low finite-precision arithmetic, demonstrating its efficacy for lossless applications and its suitability for resource-limited embedded devices. Leveraging IRMMT, we propose a novel RDH algorithm designed to operate efficiently within the limitations of microcontroller resources while preserving image quality and integrity. The algorithm is implemented and evaluated on the Arduino Due board, which features the AT91SAM3X8E 32-bit ARM Cortex-M3 microcontroller, demonstrating the feasibility and effectiveness of the proposed approach for secure wireless data communication. Through theoretical formulation, algorithm design, and embedded implementation, this paper contributes to advancing RDH methodologies for resource-limited embedded devices.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70218
Citations: 0
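The property that makes IRMMT suitable here, an exact integer-to-integer round trip under finite-precision arithmetic, can be demonstrated with a generic Haar-like lifting step. This is a stand-in example of the integer-reversible idea; the Meixner-moment transform itself is not reproduced.

```python
def forward_lift(a, b):
    # Integer-reversible lifting: difference, then update with a floor shift.
    d = a - b
    s = b + (d >> 1)          # >> is a floor shift, so everything stays integer
    return s, d

def inverse_lift(s, d):
    b = s - (d >> 1)          # undo the update exactly
    return b + d, b           # recover (a, b) losslessly

assert forward_lift(3, 5) == (4, -2)
assert inverse_lift(4, -2) == (3, 5)   # exact round trip, no precision loss
```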
Multi-Task Learning for Chinese Character and Radical Recognition With Dynamic Channel-Spatial Attention and Rotational Positional Encoding
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-18 · DOI: 10.1049/ipr2.70213
Wei Deng, XuHong Yu, HongWei Li, ShaoWen Du, Bing He
Abstract: Optical character recognition (OCR) plays a crucial role in digitizing archives and documents. However, recognizing complex Chinese characters remains challenging owing to their intricate structures and sequential patterns. This study introduces an advanced OCR model that integrates EfficientNetV2 as the backbone within a transformer-based architecture to enhance feature extraction. To address the limitations of traditional adaptive feature selection, we propose a dynamic collaborative channel-spatial attention (DCCSA) module. This module combines channel attention, spatial attention, and channel shuffling to dynamically capture global dependencies and optimize feature representations across both spatial and channel dimensions. Additionally, rotational position encoding (RoPE) is incorporated into the transformer to accurately capture the spatial relationships between characters and radicals, ensuring precise representation of complex hierarchical structures. Further, the model adopts a multitask learning framework that jointly decodes characters and radicals, enabling cross-task optimization and significantly enhancing recognition performance. Experimental results on four benchmark datasets demonstrate that the proposed model outperforms existing methods, achieving significant improvements on both printed and handwritten Chinese text. Moreover, the model shows strong generalization capabilities on challenging scene-text datasets, underscoring its effectiveness in addressing the OCR challenges associated with intricate scripts.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70213
Citations: 0
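Rotational position encoding fits in a few lines; the sketch below is the common half-split RoPE formulation applied to a generic feature tensor, with the usual 10000 frequency base assumed rather than taken from the paper.

```python
import torch

def rope(x):
    # x: (..., seq_len, dim) with even dim. Each feature pair is rotated by a
    # position-dependent angle, so dot products encode relative position.
    seq_len, dim = x.shape[-2], x.shape[-1]
    half = dim // 2
    freq = 10000.0 ** (-torch.arange(half, dtype=x.dtype) / half)
    angle = torch.arange(seq_len, dtype=x.dtype).unsqueeze(1) * freq  # (seq, half)
    cos, sin = angle.cos(), angle.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```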
SCUX-Net: Integrating Multi-Scale Features and Channel-Spatial Attention Model for Intracranial Aneurysm Segmentation
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-16 · DOI: 10.1049/ipr2.70209
Aiping Wu, Mingquan Ye, Jiaqi Wang, Ye Shi, Yunfeng Zhou
Abstract: Intracranial aneurysms are a common cerebrovascular condition. Owing to their small size and complex anatomical location, accurately segmenting intracranial aneurysms in computed tomography angiography (CTA) images remains a challenging task. To address these challenges, we propose SCUX-Net, a novel lightweight convolutional neural network designed for intracranial aneurysm segmentation. SCUX-Net builds upon the 3D UX-Net by introducing two key innovations: (1) a spatial adaptive feature module, integrated before each 3D UX-Net block, enabling multi-scale feature fusion for long-range information interaction; and (2) a convolutional block attention module, applied after each downsampling block to emphasize important features across channel and spatial dimensions while suppressing irrelevant information. Experimental results substantiate the effectiveness of SCUX-Net in segmenting intracranial aneurysms in CTA images, achieving a Dice similarity coefficient of 80% on the test set. Notably, SCUX-Net excels at detecting small (≤3 mm) and multiple aneurysms, showcasing its potential for clinical application.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70209
Citations: 0
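The convolutional block attention module is a standard design (channel attention followed by spatial attention); a minimal 2D sketch is below. Since SCUX-Net operates on 3D CTA volumes, the paper's version would presumably use Conv3d and 3D pooling instead.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    # Channel attention (shared MLP over avg- and max-pooled descriptors),
    # then spatial attention (a 7x7 conv over channel-pooled maps).
    def __init__(self, ch, reduction=16, kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ch, ch // reduction), nn.ReLU(),
                                 nn.Linear(ch // reduction, ch))
        self.spatial = nn.Conv2d(2, 1, kernel, padding=kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        gate = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        x = x * gate.view(b, c, 1, 1)                       # channel attention
        pooled = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(pooled))      # spatial attention
```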
Adaptive Convolutional Strategy for Robust Image Dehazing in Diverse Environments
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-16 · DOI: 10.1049/ipr2.70207
Hira Khan, Sung Won Kim
Abstract: Adverse weather conditions such as haze, fog, and smog degrade image visibility, adversely affecting the performance of vision-based systems. Existing dehazing methods often struggle with non-uniform haze distributions, limited detail restoration, and poor generalization across diverse scenes. To overcome these limitations, this paper presents a deep learning-based dehazing framework that jointly restores image clarity and detail. Unlike conventional algorithms that often neglect fine structure recovery, our architecture incorporates four specialized sub-modules: (i) a noise attention module for enhancing noise suppression and feature preservation; (ii) an adaptive ConvNet module; (iii) a feature extraction module for capturing salient image features; and (iv) a detail refinement module to enhance spatial fidelity. The architecture is trained end-to-end to restore both structural integrity and colour consistency under challenging conditions. Extensive experiments conducted on synthetic and real-world datasets, including indoor, outdoor, underwater, night-time, and remote-sensing scenarios, demonstrate superior generalization capability. On the SOTS indoor dataset, our method achieves a PSNR of 28.44 dB and an SSIM of 0.967, outperforming several state-of-the-art methods. Evaluations using additional metrics such as CIEDE2000 and MSE confirm the effectiveness of the proposed method in handling dense and heterogeneous haze while preserving fine textures and visual fidelity.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70207
Citations: 0
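For reference, the PSNR figure quoted above follows the standard definition; a minimal implementation, assuming float images normalised to [0, 1]:

```python
import numpy as np

def psnr(reference, restored, peak=1.0):
    # Peak signal-to-noise ratio in dB; higher means closer to the reference.
    mse = np.mean((np.asarray(reference) - np.asarray(restored)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```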
Adaptive Two-Tier Deep Learning for Content-Based Image Retrieval and Classification With Dynamic Similarity Fusion
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-15 · DOI: 10.1049/ipr2.70192
Aqeel M. Humadi, Mehdi Sadeghzadeh, Hameed A. Younis, Mahdi Mosleh
Abstract: Content-based image retrieval (CBIR) systems face challenges in computational efficiency, illumination robustness, and noise sensitivity. Traditional methods rely on handcrafted features or monolithic deep learning architectures, which either lack adaptability to diverse image domains or suffer from high computational complexity. To bridge this gap, a unique two-tier deep learning system is presented in this research to overcome these drawbacks. First, a supervised neural network (SNN) reduces dimensionality and improves interpretability by converting the HSV colour space into semantic 2D colour labels through pixel-level classification, addressing the inefficiency of processing raw RGB data while preserving illumination-invariant colour semantics. Second, a convolutional neural network (CNN) greatly increases computing efficiency by processing these labels rather than raw images; by operating on compressed 2D representations, the system achieves faster inference than standard 3D CNN pipelines. The framework introduces variable weight overall similarity (VWOS), a versatile similarity metric that combines semantic (softmax) and structural (MaxPool3) components with dynamically predicted weights, using a neural network to automatically optimise retrieval performance based on image content. This adaptive fusion resolves the limitations of fixed-weight similarity measures in handling heterogeneous query types. The system achieves precision@10 scores of 0.9-1.0 and classification accuracies of 0.85-0.98 when tested on the PH², Oxford Flowers, Corel-1k, Caltech-101, and Kvasir datasets. Notably, it outperforms current handcrafted, deep learning, and hybrid approaches, achieving a precision@10 of 1.0 on four datasets and 0.96 accuracy on medical Kvasir images. Quantitative comparisons show 9%-14% higher precision than handcrafted methods, 3%-35% improvement over deep learning baselines, and 12% better performance than hybrid systems. This approach is especially promising for applications in multimedia retrieval and medical imaging, where interpretability and accuracy are crucial.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70192
Citations: 0
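The dynamic-weighting idea behind VWOS can be sketched as a small network that predicts a content-dependent blend of the two similarity components. The weight network's input and structure are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class VWOSFusion(nn.Module):
    # Hypothetical sketch: predict a per-query weight from the query embedding,
    # then blend semantic (softmax) and structural (MaxPool3) similarity scores.
    def __init__(self, feat_dim):
        super().__init__()
        self.weight_net = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, query_feat, sim_semantic, sim_structural):
        w = self.weight_net(query_feat)            # (B, 1), content-dependent
        return w * sim_semantic + (1.0 - w) * sim_structural
```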
Optimization of Module Transferability in Single Image Super-Resolution: Universality Assessment and Cycle Residual Blocks
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-13 · DOI: 10.1049/ipr2.70206
Haotong Cheng, Zhiqi Zhang, Hao Li, Xinshang Zhang
Abstract: Deep learning has substantially advanced single image super-resolution (SISR). However, existing research has predominantly focused on raw performance gains, with little attention paid to quantifying the transferability of architectural components. In this paper, we introduce the concept of "Universality" and its associated definitions, which extend the traditional notion of "Generalization" to encompass a module's ease of transferability. We then propose the universality assessment equation (UAE), a metric that quantifies how readily a given module can be transplanted across models and reveals the combined influence of multiple existing metrics on transferability. Guided by the UAE results for standard residual blocks and other plug-and-play modules, we further design two optimized modules: the cycle residual block (CRB) and the depth-wise cycle residual block (DCRB). Through comprehensive experiments on natural-scene benchmarks, remote-sensing datasets, and other low-level tasks, we demonstrate that networks embedded with the proposed plug-and-play modules outperform several state-of-the-art methods, reaching a PSNR enhancement of up to 0.83 dB or enabling a 71.3% reduction in parameters with negligible loss in reconstruction fidelity. Similar optimization approaches could be applied to a broader range of basic modules, offering a new paradigm for the design of plug-and-play modules.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70206
Citations: 0
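For context, the standard residual block is the unit whose transferability the UAE scores; the proposed CRB and DCRB are optimised variants of it (their exact structure is given in the paper and not reproduced here).

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    # The conventional conv-ReLU-conv residual unit with an identity skip,
    # as used as the baseline plug-and-play module in SISR networks.
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                                  nn.ReLU(inplace=True),
                                  nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)
```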
Deep Learning and IoT Fusion for Crop Health Monitoring: A High-Accuracy, Edge-Optimised Model for Smart Farming
IF 2.2 · CAS Tier 4 · Computer Science
IET Image Processing · Pub Date: 2025-09-12 · DOI: 10.1049/ipr2.70208
Thomas Kinyanjui Njoroge, Edwin Juma Omol, Vincent Omollo Nyangaresi
Abstract: Crop diseases and adverse field conditions threaten global food security, particularly in resource-limited regions. Current deep-learning models for disease detection suffer from insufficient accuracy, high prediction instability under field noise, and a lack of integration with environmental context. To address these limitations, we present a hybrid deep learning architecture combining EfficientNetV2, MobileNetV2, and Vision Transformers, augmented with attention mechanisms and multiscale feature fusion. Optimised for edge deployment via TensorFlow Lite and integrated with IoT sensors for real-time soil and field monitoring, the model achieved state-of-the-art performance with 99.2% accuracy, 0.993 precision, 0.993 recall, and a near-perfect AUC of 0.999998, outperforming benchmarks such as DenseNet50 (88.4%) and ShuffleNet (95.8%). Training on 76 classes (22 diseases) demonstrated rapid convergence and robustness, with validation accuracy reaching 98.7% and minimal overfitting. Statistical validation confirmed superior stability, with 69% lower prediction variance (0.000010) than DenseNet50 (0.000035), ensuring reliable performance under real-world noise. Bayesian testing showed a 100% probability of superiority over DenseNet50 and 85.1% over ShuffleNet, while field trials on 249 real-world images achieved 97.97% accuracy, highlighting strong generalisation. IoT integration reduced false diagnoses by 92% through environmental correlation, and edge optimisation enabled real-time inference via a 30.4 MB mobile application (0.094-second latency). This work advances precision agriculture through a scalable, cloud-independent framework that unifies hybrid deep learning with edge-compatible IoT sensing. By addressing critical gaps in accuracy, stability, and contextual awareness, the system enhances crop health management in low-resource settings, offering a statistically validated tool for sustainable farming practices.
Open Access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70208
Citations: 0
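The edge-deployment step follows the standard TensorFlow Lite conversion path; a minimal sketch is below (the file names are illustrative, and the paper's hybrid architecture and quantisation settings are not reproduced).

```python
import tensorflow as tf

# Load a trained Keras model (illustrative file name) and convert it to a
# TensorFlow Lite flatbuffer with default size/latency optimisations.
model = tf.keras.models.load_model("crop_health_model.h5")
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("crop_health_model.tflite", "wb") as f:
    f.write(tflite_model)
```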