IET Image Processing: Latest Publications

More Realistic Edges, Textures, and Colors for Image Non-Homogeneous Dehazing
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-20 DOI: 10.1049/ipr2.70079
Hairu Guo, Yaning Li, Zhanqiang Huo, Shan Zhao, Yingxu Qiao
{"title":"More Realistic Edges, Textures, and Colors for Image Non-Homogeneous Dehazing","authors":"Hairu Guo,&nbsp;Yaning Li,&nbsp;Zhanqiang Huo,&nbsp;Shan Zhao,&nbsp;Yingxu Qiao","doi":"10.1049/ipr2.70079","DOIUrl":"https://doi.org/10.1049/ipr2.70079","url":null,"abstract":"<p>The existing image dehazing algorithms perform suboptimal in non-homogeneous and/or dense haze scenarios. The loss of feature information and alteration of color distribution cause images to deviate from real-world scenes when haze suppresses image details. To address these issues, we design a dual-branch non-homogeneous dehazing network integrating discrete wavelet transform (DWT), multi-scale feature fusion, and color constraints to achieve dehazed images with more realistic edges, textures, and colors. Specifically, we first introduce DWT into a multi-scale encoder–decoder network structure to capture more details and edge information. Then, a feature supplement and enhancement module (FSEM) combining features from hazy images at different scales and features from the previous stage is devised to enhance the multi-scale feature capture capability of rich textures in complex scenes. Finally, we propose a pixel-wise color consistency loss that combines pixel similarity and angular difference to constrain the dehazed images to closely match the color distribution of clear images. Experimental results indicate that the proposed dehazing network outperforms the state-of-the-art non-homogeneous dehazing methods on relevant public benchmarks and has more realistic edges, textures, and colors.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70079","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143852840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DPFA-UNet: Dual-Path Fusion Attention for Accurate Brain Tumor Segmentation
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-20 DOI: 10.1049/ipr2.70084
Jing Sha, Xu Wang, Zhongyuan Wang, Lu Wang
{"title":"DPFA-UNet: Dual-Path Fusion Attention for Accurate Brain Tumor Segmentation","authors":"Jing Sha,&nbsp;Xu Wang,&nbsp;Zhongyuan Wang,&nbsp;Lu Wang","doi":"10.1049/ipr2.70084","DOIUrl":"https://doi.org/10.1049/ipr2.70084","url":null,"abstract":"<p>Gliomas are the most common primary brain tumors within the central nervous system, typically observed through magnetic resonance imaging (MRI). Precise segmentation of brain tumor in MRI is highly significant for both clinical diagnosis and treatment. However, due to complexity of tumor structures, existing deep-learning-based methods for brain tumor segmentation still face challenges in accurately delineating tumor core (TC) and enhancing tumor (ET) regions, which are primary targets for actual treatment. To address this problem, this work proposes dual-path fusion attention-based UNet (DPFA-UNet) that leverages a dual-path attention block (DPA) and a concurrent attention fusion block (CAF) within a U-shaped architecture. Specifically, DPA enhances adaptability to lesions of varying sizes by using multi-scale branches that capture fine details and global features. CAF fuses high- and low-level semantic features using a parallel attention mechanism, effectively focusing on the focal regions. It also incorporates a mask generated by deep supervision mechanism to further guide feature fusion. Additionally, to reduce demand for hardware resources, we incorporate depthwise separable convolution into the model. Experiments are conducted on public BraTS 2021 and BraTS 2019 datasets. The results verify that DPFA-UNet outperforms existing brain tumor segmentation methods, particularly in segmenting TC and ET regions.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70084","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143852839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Image Compression Model Based on Dynamic Convolution and Vision Mamba
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-17 DOI: 10.1049/ipr2.70080
Lingchen Qiu, Enjian Bai, Yun Wu, Yuwen Cao, Xue-qin Jiang
{"title":"Image Compression Model Based on Dynamic Convolution and Vision Mamba","authors":"Lingchen Qiu,&nbsp;Enjian Bai,&nbsp;Yun Wu,&nbsp;Yuwen Cao,&nbsp;Xue-qin Jiang","doi":"10.1049/ipr2.70080","DOIUrl":"https://doi.org/10.1049/ipr2.70080","url":null,"abstract":"<p>We propose an efficient image compression scheme leveraging Vision Mamba and dynamic convolution, addressing the limitations of existing methods, such as failure to capture long-range pixel dependencies and high computational complexity. Our approach improves both global and local information learning with reduced computational cost. Experimental results on the Kodak, Tecnick and CLIC datasets show that our model achieves competitive performance with lower algorithm complexity. Our code is available on: https://github.com/Lynxsx/ICVM.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70080","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143846017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Integrating Linear Skip-Attention With Transformer-Based Network of Multi-Level Features Extraction for Partial Point Cloud Registration
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-15 DOI: 10.1049/ipr2.70055
Qinyu He, Tao Sun
{"title":"Integrating Linear Skip-Attention With Transformer-Based Network of Multi-Level Features Extraction for Partial Point Cloud Registration","authors":"Qinyu He,&nbsp;Tao Sun","doi":"10.1049/ipr2.70055","DOIUrl":"https://doi.org/10.1049/ipr2.70055","url":null,"abstract":"<p>Accurate point correspondences is critical for rigid point cloud registration in correspondence-based methods. Many previous learning-based methods employ encoder-decoder backbone for point feature extraction, while applying attention mechanism for sparse superpoints to deal with the partial overlap situation. However, few of these methods focus on the intermediate layers yet mainly pay attention on the top-most patch features, thus neglecting multi-faceted feature perspectives leading to potential overlap areas estimation inaccuracy. Meanwhile, obtaining correct correspondences is usually interfered with the one-to-many case and outliers. To address these issues, we propose a multi-level features extraction network with integrating linear dual attention mechanism into skip-connection stage of encoder-decoder backbone, both efficiently suppressing irrelevant information and guiding residual features to learn the common regions on which the network should focus to tackle the overlap estimation inaccuracy issue, combined with a parallel-structured decoder forming distinguishable features and potential overlapping regions. Additionally, a two-stage correspondences pruning process is designed to tackle the mismatch issue, which mainly depends on the rigid geometric constraint. Extensive experiments conducted on indoor and outdoor scene datasets demonstrate our method's accuracy and stability, by outperforming state-of-the-art methods on registration recall.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70055","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143835865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Edge-Enhanced Feature Pyramid SwinUNet: Advanced Segmentation of Lung Nodules in CT Images
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-14 DOI: 10.1049/ipr2.70072
Akila Agnes S, Arun Solomon A, K Karthick, Mejdl Safran, Sultan Alfarhood
{"title":"Edge-Enhanced Feature Pyramid SwinUNet: Advanced Segmentation of Lung Nodules in CT Images","authors":"Akila Agnes S,&nbsp;Arun Solomon A,&nbsp;K Karthick,&nbsp;Mejdl Safran,&nbsp;Sultan Alfarhood","doi":"10.1049/ipr2.70072","DOIUrl":"https://doi.org/10.1049/ipr2.70072","url":null,"abstract":"<p>In the field of oncology, lung cancer is a leading contributor to cancer-related mortality, highlighting the need for early detection of lung nodules for effective intervention. However, accurate segmentation of lung nodules in Computed Tomography (CT) images remains a significant challenge due to issues such as heterogeneous nodule dimensions, low contrast, and their visual similarity with surrounding tissues. To address these challenges, this study proposes the Edge-Enhanced Feature Pyramid SwinUNet (EE-FPS-UNet), an advanced segmentation model that integrates a modified Swin Transformer with a feature pyramid network (FPN). The research objective is to enhance boundary delineation and multi-scale feature aggregation for improved segmentation performance. The proposed model uses the Swin Transformer to capture long-range dependencies and integrates an FPN for robust multi-scale feature aggregation. Its window-based self-attention mechanism also reduces computational complexity, making it well-suited for high-resolution CT images. Additionally, an edge detection module enhances segmentation by providing edge-related features to the decoder, improving boundary precision. A comparative analysis evaluates the EE-FPS-UNet against leading models, including PSPNet, U-Net, Attention U-Net, and DeepLabV3. The results demonstrate that the proposed model outperforms these models, achieving a Dice Similarity of 0.91 and a sensitivity of 0.89, establishing its efficacy for lung nodule segmentation.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70072","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143831451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Comprehensive Review of U-Net and Its Variants: Advances and Applications in Medical Image Segmentation
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-14 DOI: 10.1049/ipr2.70019
Wang Jiangtao, Nur Intan Raihana Ruhaiyem, Fu Panpan
{"title":"A Comprehensive Review of U-Net and Its Variants: Advances and Applications in Medical Image Segmentation","authors":"Wang Jiangtao,&nbsp;Nur Intan Raihana Ruhaiyem,&nbsp;Fu Panpan","doi":"10.1049/ipr2.70019","DOIUrl":"https://doi.org/10.1049/ipr2.70019","url":null,"abstract":"<p>Medical images often exhibit low and blurred contrast between lesions and surrounding tissues, with considerable variation in lesion edges and shapes even within the same disease, leading to significant challenges in segmentation. Therefore, precise segmentation of lesions has become an essential prerequisite for patient condition assessment and formulation of treatment plans. Significant achievements have been made in research related to the U-Net model in recent years. It improves segmentation performance and is extensively applied in the semantic segmentation of medical images to offer technical support for consistent quantitative lesion analysis methods. First, this paper classifies medical image datasets on the basis of their imaging modalities and then examines U-Net and its various improvement models from the perspective of structural modifications. The research objectives, innovative designs, and limitations of each approach are discussed in detail. Second, we summarise the four central improvement mechanisms of the U-Net and U-Net variant algorithms: the jump-connection mechanism, the residual-connection mechanism, 3D-UNet, and the transformer mechanism. Finally, we examine the relationships among the four core enhancement mechanisms and commonly utilized medical datasets and propose potential avenues and strategies for future advancements. This paper provides a systematic summary and reference for researchers in related fields, and we look forward to designing more efficient and stable medical image segmentation network models based on the U-Net network.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70019","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143831452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Breaking Tradition With Perception: Debiasing Strategies in Cloth-Changing Person Re-Identification
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-11 DOI: 10.1049/ipr2.70075
YiPeng Yin, Jian Wu, Bo Li
{"title":"Breaking Tradition With Perception: Debiasing Strategies in Cloth-Changing Person Re-Identification","authors":"YiPeng Yin,&nbsp;Jian Wu,&nbsp;Bo Li","doi":"10.1049/ipr2.70075","DOIUrl":"https://doi.org/10.1049/ipr2.70075","url":null,"abstract":"<p>Person ReID aims to match images of individuals captured from different camera views for identity retrieval. Traditional ReID methods primarily rely on clothing features, assuming that individuals do not change clothes in a short time frame. This assumption significantly reduces recognition accuracy when clothing changes, particularly in long-term ReID tasks cloth-changing person re-identification (CC-ReID). Thus, achieving effective re-identification in clothing-change scenarios has become a critical challenge. This paper proposes an automatic perception model (APM) to address the break posed by clothing changes. The model uses a dual-branch with a dynamic perception learning (DPL) strategy and a perception branch, minimizing the bias introduced by clothing on identity recognition while preserving semantic features. The DPL strategy dynamically adjusts training weights to enhance the model's ability to learn from varying sample difficulties and feature distributions. The perception branch captures deeper feature relationships, alleviating the impact of clothing bias and improving the model's ability to distinguish intra-class transformations. Validated on Celeb-Reid and Celeb-Reid-light datasets, APM achieves a mean average precision (mAP) of 22.6% and 25.9%, with Rank-1 accuracy of 77.3% and 79.5%, respectively. It also excels in short-term ReID, achieving 90% mAP and 96.3% Rank-1 accuracy on Markt1501, demonstrating robustness across scenarios.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70075","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143818711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Variational Weighted $\ell_p$-$\ell_q$ Regularization for Hyperspectral Image Restoration Under Mixed Noise
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-11 DOI: 10.1049/ipr2.70073
Hazique Aetesam, V. B. Surya Prasath
{"title":"Variational Weighted \u0000 \u0000 \u0000 \u0000 ℓ\u0000 p\u0000 \u0000 −\u0000 \u0000 ℓ\u0000 q\u0000 \u0000 \u0000 $ell _p-ell _q$\u0000 Regularization for Hyperspectral Image Restoration Under Mixed Noise","authors":"Hazique Aetesam,&nbsp;V. B. Surya Prasath","doi":"10.1049/ipr2.70073","DOIUrl":"https://doi.org/10.1049/ipr2.70073","url":null,"abstract":"&lt;p&gt;In this paper, we propose to use weighted &lt;span&gt;&lt;/span&gt;&lt;math&gt;\u0000 &lt;semantics&gt;\u0000 &lt;msub&gt;\u0000 &lt;mi&gt;ℓ&lt;/mi&gt;\u0000 &lt;mn&gt;2&lt;/mn&gt;\u0000 &lt;/msub&gt;\u0000 &lt;annotation&gt;$ell _2$&lt;/annotation&gt;\u0000 &lt;/semantics&gt;&lt;/math&gt;-norm for approximating the solution of general &lt;span&gt;&lt;/span&gt;&lt;math&gt;\u0000 &lt;semantics&gt;\u0000 &lt;mrow&gt;\u0000 &lt;msub&gt;\u0000 &lt;mi&gt;ℓ&lt;/mi&gt;\u0000 &lt;mi&gt;p&lt;/mi&gt;\u0000 &lt;/msub&gt;\u0000 &lt;mo&gt;−&lt;/mo&gt;\u0000 &lt;msub&gt;\u0000 &lt;mi&gt;ℓ&lt;/mi&gt;\u0000 &lt;mi&gt;q&lt;/mi&gt;\u0000 &lt;/msub&gt;\u0000 &lt;/mrow&gt;\u0000 &lt;annotation&gt;$ell _p-ell _q$&lt;/annotation&gt;\u0000 &lt;/semantics&gt;&lt;/math&gt;-norm regularization problem for recovering hyperspectral images (HSI) corrupted by a mixture of Gaussian-impulse noise. As a special case of &lt;span&gt;&lt;/span&gt;&lt;math&gt;\u0000 &lt;semantics&gt;\u0000 &lt;mrow&gt;\u0000 &lt;mi&gt;p&lt;/mi&gt;\u0000 &lt;mo&gt;,&lt;/mo&gt;\u0000 &lt;mi&gt;q&lt;/mi&gt;\u0000 &lt;mo&gt;∈&lt;/mo&gt;\u0000 &lt;mo&gt;{&lt;/mo&gt;\u0000 &lt;mn&gt;1&lt;/mn&gt;\u0000 &lt;mo&gt;,&lt;/mo&gt;\u0000 &lt;mn&gt;2&lt;/mn&gt;\u0000 &lt;mo&gt;}&lt;/mo&gt;\u0000 &lt;/mrow&gt;\u0000 &lt;annotation&gt;$p,qin lbrace 1,2rbrace$&lt;/annotation&gt;\u0000 &lt;/semantics&gt;&lt;/math&gt;, we design an optimization framework to accommodate the combined effect of different noise sources. An initial impulse noise pre-detection phase decouples the raw noisy HSI data into impulse and Gaussian corrupted pixels. Gaussian corrupted pixels are handled by data-fidelity term in &lt;span&gt;&lt;/span&gt;&lt;math&gt;\u0000 &lt;semantics&gt;\u0000 &lt;mrow&gt;\u0000 &lt;msub&gt;\u0000 &lt;mi&gt;ℓ&lt;/mi&gt;\u0000 &lt;mn&gt;2&lt;/mn&gt;\u0000 &lt;/msub&gt;\u0000 &lt;mo&gt;−&lt;/mo&gt;\u0000 &lt;mi&gt;norm&lt;/mi&gt;\u0000 &lt;/mrow&gt;\u0000 &lt;annotation&gt;$ell _2-{rm norm}$&lt;/annotation&gt;\u0000 &lt;/semantics&gt;&lt;/math&gt; while impulse corrupted pixels possess more Laplacian like behavior; modeled using &lt;span&gt;&lt;/span&gt;&lt;math&gt;\u0000 &lt;semantics&gt;\u0000 &lt;mrow&gt;\u0000 &lt;msub&gt;\u0000 &lt;mi&gt;ℓ&lt;/mi&gt;\u0000 &lt;mn&gt;1&lt;/mn&gt;\u0000 &lt;/msub&gt;\u0000 &lt;mo&gt;−&lt;/mo&gt;\u0000 &lt;mi&gt;norm&lt;/mi&gt;\u0000 &lt;/mrow&gt;\u0000 &lt;annotation&gt;$ell _1-{rm norm}$&lt;/annotation&gt;\u0000 &lt;/semantics&gt;&lt;/math&gt;. 
Solutions of problems involving &lt;span&gt;&lt;/span&gt;&lt;math&gt;\u0000 &lt;semantics&gt;\u0000 &lt;mrow&gt;\u0000 &lt;msub&gt;\u0000 &lt;mi&gt;ℓ&lt;/mi&gt;\u0000 &lt;mn&gt;1&lt;/mn&gt;\u0000 &lt;/msub&gt;\u0000 &lt;mo&gt;−&lt;/mo&gt;\u0000 &lt;mi&gt;norm&lt;/mi&gt;\u0000 &lt;/mrow&gt;\u0000 &lt;annotation&gt;$ell _1-{rm norm}$&lt;/annotation&gt;\u0000 &lt;/s","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70073","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143822082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
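A worked reading of the model the abstract sketches, in LaTeX; the masks, regularizer, and weights ($\mathcal{P}_G$, $\mathcal{P}_I$, $\mathcal{R}$, $\lambda_i$) are our notation, not the paper's:

```latex
% \ell_2 fidelity on Gaussian-corrupted pixels (mask P_G),
% \ell_1 fidelity on impulse-corrupted pixels (mask P_I):
\min_{\mathcal{X}}\;
  \tfrac{1}{2}\,\bigl\|\mathcal{P}_G(\mathcal{X}-\mathcal{Y})\bigr\|_2^2
  + \lambda_1\,\bigl\|\mathcal{P}_I(\mathcal{X}-\mathcal{Y})\bigr\|_1
  + \lambda_2\,\mathcal{R}(\mathcal{X})
% with the \ell_1 term approximated by a weighted \ell_2-norm
% (iteratively reweighted least squares):
\|z\|_1 \;\approx\; \sum_i w_i\, z_i^2,
\qquad w_i = \frac{1}{|z_i| + \epsilon}.
```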
Citations: 0
Research on the Diagnosis Method of Pancreatic Lesions by Endoscopic Ultrasound Based on Twin Network Structure
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-11 DOI: 10.1049/ipr2.70063
Xiao Xin, Huang Danping, Hu Shanshan, Shen Yang
{"title":"Research on the Diagnosis Method of Pancreatic Lesions by Endoscopic Ultrasound Based on Twin Network Structure","authors":"Xiao Xin,&nbsp;Huang Danping,&nbsp;Hu Shanshan,&nbsp;Shen Yang","doi":"10.1049/ipr2.70063","DOIUrl":"https://doi.org/10.1049/ipr2.70063","url":null,"abstract":"<p>To tackle the endoscopic ultrasound (EUS) pancreatic visual information, we propose a novel twin diagnostic network architecture (TDN) which consists of two identical feature extraction network structures. Model 1 is used to distinguish the categories of pancreatic visual information. Model 2 distinguishes whether there is cancerous information in pancreatic visual information. If cancerous information is included, gradient-weighted class activation mapping (Grad-CAM) is employed to calculate the activation heat map of visual information to present the specific location of the cancerous area in the visual information. To effectively integrate detailed texture information with abstract semantic information, we find the optimal proportion relationship required for feature fusion in each stage output feature vector dimension. The experimental results show that the classification accuracy of the TDN network can reach 98.344% for the pancreatic part and 99.471% for the specific part of the pancreas whether canceration occurs.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70063","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143818710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Improved Object Detection Algorithm for UAV Images Based on Orthogonal Channel Attention Mechanism and Triple Feature Encoder
IF 2.0 | Region 4 | Computer Science
IET Image Processing Pub Date: 2025-04-10 DOI: 10.1049/ipr2.70061
Wenfeng Wang, Chaomin Wang, Sheng Lei, Min Xie, Binbin Gui, Fang Dong
{"title":"An Improved Object Detection Algorithm for UAV Images Based on Orthogonal Channel Attention Mechanism and Triple Feature Encoder","authors":"Wenfeng Wang,&nbsp;Chaomin Wang,&nbsp;Sheng Lei,&nbsp;Min Xie,&nbsp;Binbin Gui,&nbsp;Fang Dong","doi":"10.1049/ipr2.70061","DOIUrl":"https://doi.org/10.1049/ipr2.70061","url":null,"abstract":"<p>Object detection in Unmanned Aerial Vehicle (UAV) imagery plays an important role in many fields. However, UAV images usually exhibit characteristics different from those of natural images, such as complex scenes, dense small targets, and significant variations in target scales, which pose considerable challenges for object detection tasks. To address these issues, this paper presents a novel object detection algorithm for UAV images based on YOLOv8 (referred to as OATF-YOLO). First, an orthogonal channel attention mechanism is added to the backbone network to imporve the algorithm's ability to extract features and clear up any confusion between features in the foreground and background. Second, a triple feature encoder and a scale sequence feature fusion module are integrated into the neck network to bolster the algorithm's multi-scale feature fusion capability, thereby mitigating the impact of substantial differences in target scales. Finally, an inner factor is introduced into the loss function to further upgrade the robustness and detection accuracy of the algorithm. Experimental results on the VisDrone2019-DET dataset indicate that the proposed algorithm significantly outperforms the baseline model. On the validation set, the OATF-YOLO algorithm achieves a precision of 59.1%, a recall of 40.5%, an mAP50 of 42.5%, and an mAP50:95 of 25.8%. These values represent improvements of 3.8%, 3.0%, 4.1%, and 3.3%, respectively. Similarly, on the test set, the OATF-YOLO algorithm achieves a precision of 52.3%, a recall of 34.7%, an mAP50 of 33.4%, and an mAP50:95 of 19.1%, reflecting enhancements of 4.0%, 3.3%, 4.0%, and 2.6%, respectively. To further validate the model's robustness and scalability, experiments are conducted on the NWPU-VHR10 dataset, and OATF-YOLO also achieves excellent performance. Furthermore, compared to several classical object detection algorithms, OATF-YOLO demonstrates superior detection performance on both datasets and indicates that it is better suited for UAV image object detection scenarios.</p>","PeriodicalId":56303,"journal":{"name":"IET Image Processing","volume":"19 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.70061","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143818413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0