Digital Signal Processing: Latest Articles

An evaluation and study of detail contrast preservation and color consistency in decolorization
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105468. Pub Date: 2025-07-15. DOI: 10.1016/j.dsp.2025.105468
Shenglin Peng, Tong Gao, Shuyi Qu, Zhe Yu, Jun Wang, Jinye Peng
Abstract: Grayscale conversion plays a crucial role in image processing, particularly for edge detection and segmentation tasks, where decolorization quality directly impacts subsequent analysis. An ideal decolorization algorithm should be both efficient and robust while preserving color consistency and detail contrast. In this study, we revisit the RTCP (Real-time Contrast-Preserving Decolorization) algorithm and propose three key optimizations: a clustering-guided decolorization approach, a locally adaptive decolorization strategy, and a weight-optimized decolorization method. To enhance solution quality, we implement a constrained particle swarm optimization framework to systematically explore the parameter space. Experimental validation on two standard datasets (Čadík and CSDD) demonstrates that our optimized methods handle diverse decolorization scenarios more effectively while maintaining competitive performance against existing approaches. Recognizing the limitations of current evaluation metrics in assessing detail contrast preservation, we introduce the D-C2G-SSIM metric for more accurate quantitative assessment. Comparative results show consistent improvements over the original RTCP algorithm, with the average D-C2G-SSIM score increasing from 0.8331 to 0.8442 on the Čadík dataset and from 0.8696 to 0.8847 on the CSDD dataset, confirming the effectiveness of our approach.
Citations: 0
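The weight-search idea behind RTCP-style decolorization can be illustrated with a small sketch. It assumes the commonly described setup in which the gray value is a linear combination of R, G and B with non-negative weights that sum to 1, searched on a 0.1 grid. The `contrast_score` function is a toy stand-in invented for this illustration; it is not the energy used by RTCP or by the optimizations in the paper, and all function names are hypothetical.

```python
def candidate_weights(step=10):
    """All (wr, wg, wb) that are non-negative multiples of 1/step and sum to 1."""
    return [(r / step, g / step, (step - r - g) / step)
            for r in range(step + 1)
            for g in range(step + 1 - r)]


def to_gray(pixels, w):
    """Linear decolorization of a list of (r, g, b) tuples."""
    return [w[0] * r + w[1] * g + w[2] * b for r, g, b in pixels]


def contrast_score(pixels, gray):
    """Toy score (assumption): squared mismatch between color distance and
    gray distance, summed over all pixel pairs. Lower is better."""
    total = 0.0
    for i in range(len(pixels)):
        for j in range(i + 1, len(pixels)):
            color_diff = sum((a - b) ** 2
                             for a, b in zip(pixels[i], pixels[j])) ** 0.5
            gray_diff = abs(gray[i] - gray[j])
            total += (color_diff - gray_diff) ** 2
    return total


def best_weights(pixels):
    """Exhaustive search over the discrete weight candidates."""
    return min(candidate_weights(),
               key=lambda w: contrast_score(pixels, to_gray(pixels, w)))
```

With the default grid there are 66 candidates, so exhaustive search is cheap; a particle swarm search such as the one mentioned in the abstract would instead explore a continuous weight space.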
Cross-modal information interaction of binocular predictive networks for RGBT tracking
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105473. Pub Date: 2025-07-14. DOI: 10.1016/j.dsp.2025.105473
Jianming Chen, Dingjian Li, Xiangjin Zeng, Yaman Jing, Zhenbo Ren, Jianglei Di, Yuwen Qin
Abstract: RGBT tracking aims to aggregate the information from both visible and thermal infrared modalities to achieve visual object tracking. Although plenty of RGBT tracking methods have been proposed, they usually lead to target loss or tracking drift due to the inability to effectively extract useful feature information contained in the multimodal information. To handle this problem, we propose a cross-modal information interaction binocular prediction network. Firstly, a deep, multi-branch feature extraction network is constructed based on Siamese networks to fully exploit the semantic features of images from different optical modalities. The designed image feature enhancement modules are utilized to effectively capture and enhance object features, thereby improving tracking performance. Secondly, a fusion scheme is developed to achieve bidirectional fusion of multimodal features, leveraging complementary cross-modal information to retain distinguishable object characteristics across different modalities. Finally, the anchor-free concept is introduced into the RGBT object tracking domain and combined with a Peak Adaptive Selection (PAS) module to design a binocular prediction network, making the tracker more flexible and versatile. Evaluation experiments conducted on three standard RGBT tracking datasets, namely GTOT, RGBT234, and LasHeR, demonstrate that the modifications made to the baseline Siamese network architecture are effective. The proposed tracker is competitive with existing state-of-the-art (SOTA) methods, achieving comparable results in terms of precision and success rate. The key advantage of the proposed method lies in the robust fusion of multimodal features and the flexibility introduced by the anchor-free prediction design, which contribute to the stability of the proposed tracker across various scenarios. Code is released at https://github.com/JMChenl/RGBT-tracking.git.
Citations: 0
Wavelet-based multi-level information compensation learning for visible-infrared person re-identification
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105471. Pub Date: 2025-07-14. DOI: 10.1016/j.dsp.2025.105471
Haobiao Fan, Yanbing Chen, Yibo Chen, Zhixin Tie, Hao Sheng, Wei Ke
Abstract: The main challenge in cross-modal (visible-infrared) person re-identification (VI-ReID) is extracting discriminative features from different modalities. Most existing methods focus on minimizing modal differences but overlook the shallow modality-invariant information lost as network depth increases. To address this, we propose the Wavelet-based Multi-level Information Compensation (WMIC) learning method. At multiple network stages, we design an Information Compensation Block (ICB) that applies wavelet decomposition to deep features, producing four wavelet subbands to preserve modality-invariant details and enlarge the receptive field. These subbands are used to compute an attention matrix with shallow features, which is then applied to enhance the local information of the shallow features. Additionally, we represent each person image with two sets of embeddings by introducing a Wavelet Enhancement Block (WEB) to generate an additional embedding. Finally, we use a dual-branch center-guided loss to make the two embeddings complementary, thereby reducing the disparity between infrared and visible images. Extensive experiments on the SYSU-MM01, RegDB, and LLCM datasets demonstrate that WMIC outperforms existing methods.
Citations: 0
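The "four wavelet subbands" produced by a single-level 2-D wavelet decomposition can be made concrete with a small sketch. It assumes the orthonormal Haar wavelet; the abstract does not say which wavelet WMIC uses, and the function name is hypothetical. Subband naming conventions (LH versus HL) also differ between libraries.

```python
def haar_dwt2(img):
    """Single-level 2-D Haar transform of a 2-D list with even height and width.

    Returns (LL, LH, HL, HH); each subband has half the input resolution.
    """
    h, w = len(img), len(img[0])
    hh, hw = h // 2, w // 2
    LL = [[0.0] * hw for _ in range(hh)]
    LH = [[0.0] * hw for _ in range(hh)]
    HL = [[0.0] * hw for _ in range(hh)]
    HH = [[0.0] * hw for _ in range(hh)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]          # top row of the 2x2 block
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom row of the block
            LL[i // 2][j // 2] = (a + b + c + d) / 2  # local average (approximation)
            LH[i // 2][j // 2] = (a + b - c - d) / 2  # difference between rows
            HL[i // 2][j // 2] = (a - b + c - d) / 2  # difference between columns
            HH[i // 2][j // 2] = (a - b - c + d) / 2  # diagonal difference
    return LL, LH, HL, HH
```

Because the transform is orthonormal, the four subbands together carry the same energy as the input, which is one reason wavelet downsampling can shrink resolution without discarding detail.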
A unified estimation method for target number and angle-range parameters in FDA-MIMO radar under impulse noise environments
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105463. Pub Date: 2025-07-14. DOI: 10.1016/j.dsp.2025.105463
Menghan Chen, Hongyuan Gao, Yapeng Liu, Helin Sun
Abstract: In the presence of impulse noise, estimating both the target number and angle-range parameters of frequency diverse array multiple-input multiple-output (FDA-MIMO) radars is a recognized challenge. In this paper, a novel unified method is introduced for simultaneously estimating both the angle-range parameters and the target number of FDA-MIMO radars under impulse noise. To mitigate the effects of impulse noise, a novel adaptive low-order covariance (ALC) method that requires no prior information is proposed. Additionally, a two-dimensional spatial spectrum function is derived using the ALC-MVDR method, leveraging the ability of the minimum-variance distortionless response (MVDR) beamformer to provide a two-dimensional spatial spectrum when the target number is unknown. To resolve the two-dimensional spatial spectrum function, the multimodal quantum sunflower optimization algorithm (MQSFOA) is introduced, which can effectively identify the number of spectral peaks and estimate the peak locations without quantization errors. The interrelated Cramér–Rao bound (CRB) is then deduced for evaluating the developed method under conditions of impulse noise. Comparative simulation studies demonstrate the significant performance improvements of the presented method.
Citations: 0
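For orientation, the textbook MVDR beamformer and its spatial spectrum are written out below for a joint angle-range steering vector. The symbols are assumptions introduced here, not notation from the paper: \(\mathbf{a}(\theta, r)\) is the steering vector and \(\mathbf{R}\) a covariance matrix. In the ALC-MVDR method, \(\mathbf{R}\) would presumably be the adaptive low-order covariance, whose definition the abstract does not give.

```latex
\mathbf{w}_{\mathrm{MVDR}}(\theta, r)
  = \frac{\mathbf{R}^{-1}\mathbf{a}(\theta, r)}
         {\mathbf{a}^{H}(\theta, r)\,\mathbf{R}^{-1}\,\mathbf{a}(\theta, r)},
\qquad
P_{\mathrm{MVDR}}(\theta, r)
  = \frac{1}{\mathbf{a}^{H}(\theta, r)\,\mathbf{R}^{-1}\,\mathbf{a}(\theta, r)}
```

Peaks of \(P_{\mathrm{MVDR}}(\theta, r)\) over the angle-range plane indicate candidate targets, so counting and locating the peaks yields the target number and the parameter estimates together without fixing the number of targets in advance.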
Kernel Gaussian processes based extended target tracking in polar coordinate
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105462. Pub Date: 2025-07-14. DOI: 10.1016/j.dsp.2025.105462
Dongsheng Yang, Yunfei Guo, Hoseok Sul, Jee Woong Choi, Taek Lyul Song
Abstract: Most traditional extended target tracking (ETT) methods struggle with the strong nonlinearity introduced by polar coordinate measurements and the unknown maneuvering motion model. These factors either lead to high approximation errors or impose a high computational cost, making accurate and efficient tracking challenging. To address these problems, a kernel Gaussian process-based extended target tracking (KGP-ETT) algorithm is proposed. First, the kernel mean embedding (KME) algorithm embeds the posterior distribution into a high-dimensional reproducing kernel Hilbert space (RKHS) and propagates the state particles through the nonlinear motion model, thereby effectively capturing the inherent nonlinearity. Second, based on the KME method, a kernel-based measurement update is proposed to estimate the target state in a linearized manner by integrating kernel techniques into the Gaussian process (GP) framework. Finally, the computational complexity and the theoretical posterior Cramér-Rao lower bound (PCRLB) of the proposed algorithm are analyzed. Simulation and real-world experiments demonstrate that, during target maneuvering, KGP-ETT achieves up to a 77% reduction in centroid root mean square error (RMSE), a 64% reduction in extent RMSE, and a 148% improvement in intersection over union (IoU) compared to state-of-the-art GP and Variational Bayesian (VB) methods. These results highlight the robustness and accuracy of KGP-ETT in handling complex nonlinear ETT problems in polar coordinates.
Citations: 0
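The kernel mean embedding mentioned in the abstract has a simple empirical form: a particle set is represented by the average of kernel functions centred on the particles. The sketch below evaluates that embedding at a query point. The RBF kernel, the bandwidth, and the function names are assumptions made for illustration; the abstract does not specify the kernel used by KGP-ETT.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_mean_embedding(particles, query, bandwidth=1.0):
    """Empirical kernel mean embedding evaluated at `query`:
    mu(query) = (1/N) * sum_i k(particle_i, query)."""
    return sum(rbf_kernel(p, query, bandwidth) for p in particles) / len(particles)


def propagate(particles, motion_model):
    """Push every particle through a (possibly nonlinear) motion model;
    the embedding of the result represents the predicted distribution."""
    return [motion_model(p) for p in particles]
```

For example, `kernel_mean_embedding(propagate(particles, f), q)` evaluates the embedded predicted distribution at `q` for any user-supplied motion model `f`, without linearizing `f`.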
MBDBFormer: a multimodal bridge dual branch Transformer for person re-identification
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105481. Pub Date: 2025-07-11. DOI: 10.1016/j.dsp.2025.105481
Xiangyu Deng, Jing Ding
Abstract: A key challenge in the person re-identification (ReID) task is the extraction of robust and discriminative pedestrian features. However, the sensitivity of RGB images in complex scenes to lighting, viewing-angle differences, and occlusion seriously affects the stability of feature extraction. To address these problems, we propose a Multimodal Bridge Dual Branch Transformer (MBDBFormer) that combines a CNN and a Transformer. First, for image preprocessing, we use RGB and the luminance component of IHS, with its low- and high-frequency components obtained by frequency-domain transformation (I), as multimodal inputs, so that the network takes into account both light adaptation and color information. Second, to effectively fuse the feature advantages of the two modalities, the preprocessed image information is input into a bridge branch network consisting of a multilayer downsampling network, which outputs one global and four local features through Transformer coding. Finally, using dynamic allocation of attention weights and focusing on strengthening the feature expression of discriminative regions such as edges and textures, we design the Gated Dynamic Attention and Feature Interaction Mechanism (GDFM), which establishes long-range dependencies between the RGB and I features and achieves complementary optimization of the two modal features. It enables the fused output features to retain the rich color information of the RGB modality while incorporating the illumination robustness of the I modality. Experimental results show that the proposed method outperforms state-of-the-art methods on the general-purpose Market1501, DukeMTMC, and MSMT17 datasets and on the Occluded-Duke occlusion dataset, which verifies its effectiveness on the task of person re-identification.
Citations: 0
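One plausible reading of the preprocessing step is sketched below: compute the IHS intensity channel as the mean of R, G and B, then split it into low- and high-frequency parts in the Fourier domain. This is an interpretation, not the paper's pipeline; the square low-pass mask, the `cutoff` parameter, and all function names are assumptions, and the naive DFT is only practical for tiny images.

```python
import cmath


def intensity(rgb):
    """IHS intensity channel: per-pixel mean of (r, g, b)."""
    return [[sum(px) / 3.0 for px in row] for row in rgb]


def dft2(img, inverse=False):
    """Naive 2-D discrete Fourier transform (O(N^4)); illustration only."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    scale = h * w if inverse else 1
    return [[sum(img[y][x] * cmath.exp(sign * 2j * cmath.pi * (u * y / h + v * x / w))
                 for y in range(h) for x in range(w)) / scale
             for v in range(w)]
            for u in range(h)]


def split_frequencies(img, cutoff=1):
    """Return (low, high) with low + high == img.

    `low` keeps only Fourier coefficients whose wrapped frequency index is
    at most `cutoff` along both axes (assumed mask shape)."""
    h, w = len(img), len(img[0])
    spec = dft2(img)
    low_spec = [[spec[u][v] if min(u, h - u) <= cutoff and min(v, w - v) <= cutoff else 0
                 for v in range(w)]
                for u in range(h)]
    low = [[c.real for c in row] for row in dft2(low_spec, inverse=True)]
    high = [[img[y][x] - low[y][x] for x in range(w)] for y in range(h)]
    return low, high
```

With `cutoff=0` the low-frequency part reduces to the image mean, so the high-frequency part is simply the deviation from the mean; larger cutoffs move progressively more structure into the low band.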
YOLO-MFDNet: An object detection algorithm for multi-scale remote sensing images
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105479. Pub Date: 2025-07-11. DOI: 10.1016/j.dsp.2025.105479
Renhao Jiao, Rui Fan, Weigui Nan, Ming Lu, Xiaojia Yang, Zhiqiang Zhao, Jin Dang, Yanshan Tian, Baiying Dong, Xiaowei He, Xiaoli Luo
Abstract: Despite considerable research progress in remote sensing image target detection, problems such as complex backgrounds and multi-scale variation remain prominent. To this end, this paper proposes a new detection network, YOLO-MFDNet, which aims to enhance multi-scale target perception and improve detection accuracy. The network includes three key innovations: a multi-scale spatial attention (MSSA) mechanism, a flexible scaling downsampling (FSDown) mechanism, and a distance extended IOU (DXIOU) loss function. MSSA combines multi-scale feature fusion with one-dimensional coding along the two spatial dimensions to enhance the spatial representation of the target and efficiently integrate multi-scale information. FSDown combines the advantages of depthwise separable convolution, dilated convolution, and residual connections to enlarge the receptive field while maintaining sensitivity to detail features, balancing detection accuracy and computational efficiency. The DXIOU loss function effectively reduces the risk of false detection and missed detection by introducing scale difference modeling. The effectiveness of YOLO-MFDNet is verified on three public remote sensing datasets. Compared with the benchmark model, the mAP50 of YOLO-MFDNet is 2.7% higher on the DOTA v2.0 dataset, 1.1% higher on the DIOR dataset, and 6% higher on the RSOD dataset, surpassing multiple existing models. With little change in the number of parameters, YOLO-MFDNet shows higher detection accuracy on multiple datasets, which verifies its advantage in improving detection performance while preserving computational efficiency. The source code will be available at https://github.com/stevenjiaojiao/YOLO-MFDNet.
Citations: 0
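The abstract does not give the DXIOU formula, so it is not reproduced here. As background, the sketch below shows plain IoU for axis-aligned boxes and the well-known Distance-IoU (DIoU) loss, which adds a normalized center-distance penalty; losses in this family typically extend these two ingredients. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou_loss(pred, target):
    """Standard DIoU loss: 1 - IoU + (center distance)^2 / (enclosing diagonal)^2."""
    # Squared distance between the two box centers.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    center_sq = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    diag_sq = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    penalty = center_sq / diag_sq if diag_sq > 0 else 0.0
    return 1.0 - iou(pred, target) + penalty
```

Unlike plain IoU, the distance term still provides a useful gradient signal when the predicted and target boxes do not overlap at all.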
PA-NAFNet: An improved nonlinear activation free network with pyramid attention for single image reflection removal
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105474. Pub Date: 2025-07-11. DOI: 10.1016/j.dsp.2025.105474
Qing Zhang, Yizhong Zhang, Xu Kuang, Yuanbo Zhou, Tong Tong
Abstract: Single Image Reflection Removal (SIRR) is an active topic in low-level vision, aiming to eliminate the influence of reflected objects or light sources on image quality. However, due to the ill-posed nature of SIRR and the lack of large-scale real-world reflection image datasets, existing methods degrade on real datasets and suffer from residual reflections. To address these issues, we propose an effective SIRR network called PA-NAFNet. It utilizes a nonlinear activation free network (NAFNet) as the baseline and incorporates a pyramid attention module to capture long-range pixel interactions. Additionally, during the training phase, a color jittering technique is introduced to increase the diversity of the training dataset, thereby alleviating potential color distortion after reflection removal. Experimental results on multiple reflection removal benchmarks demonstrate the effectiveness of PA-NAFNet. The relevant code is available via a link in the published article.
Citations: 0
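The "nonlinear activation free" baseline refers to NAFNet, which, as described in the NAFNet paper, replaces activation functions such as ReLU or GELU with a SimpleGate: the feature map is split into two halves along the channel axis and the halves are multiplied elementwise. The sketch below shows that operation on plain nested lists. It illustrates the baseline building block only, not the pyramid attention module; the data layout and function name are assumptions.

```python
def simple_gate(features):
    """SimpleGate on a list of channels, each channel a flat list of values.

    The channel count must be even. The output has half as many channels:
    out[c][i] = features[c][i] * features[c + C/2][i].
    """
    if len(features) % 2 != 0:
        raise ValueError("SimpleGate needs an even number of channels")
    half = len(features) // 2
    first, second = features[:half], features[half:]
    return [[a * b for a, b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(first, second)]
```

The elementwise product is itself a nonlinearity in the input, which is why the network can drop explicit activation functions.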
Multi-task learning for accurate and efficient nucleus instance segmentation based on ordinal regression
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105475. Pub Date: 2025-07-11. DOI: 10.1016/j.dsp.2025.105475
Ziwei Chen, Jingyi Li, Liangzhe Zhang, Mingyang Fu
Abstract: Nucleus instance segmentation is a critical prerequisite in many microscopy-related research fields, including pathology, drug discovery and functional genomics. The biological tasks involved depend on highly accurate and readily available nucleus segmentation results. However, both manual and existing computer-assisted methods face challenges in balancing accuracy and efficiency due to the diverse sizes, shapes and morphologies of nuclei. Additionally, some nuclei are often clustered and overlapping, which imposes higher demands on segmentation methods. Here, we present an ordinal regression-based nucleus instance segmentation method with multi-task learning that leverages rich instance-aware information encoded within spatial-based ordinal rankings. These ordinal rankings are generated and predicted by our proposed Distance Grading Decrease (DGD) strategy and EfficientNet-based lightweight network, W-Net, respectively. Combined with pixel-level foreground probabilities, these rankings are utilized to separate clustered nuclei and achieve accurate segmentation through a marker-controlled watershed algorithm. Our method demonstrates state-of-the-art accuracy and efficiency compared to others, as validated on two independent multi-tissue histology image datasets.
Citations: 0
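A minimal sketch of distance-based ordinal rankings follows. It assumes each foreground pixel is ranked by its city-block distance to the nearest background pixel, capped at a fixed number of levels. The abstract does not define the DGD strategy, so this grading rule and the function names are illustrative assumptions only.

```python
from collections import deque


def distance_to_background(mask):
    """City-block distance from every pixel to the nearest background (0) pixel,
    computed by multi-source breadth-first search. Pixels stay None if the
    mask contains no background at all."""
    h, w = len(mask), len(mask[0])
    dist = [[None if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if not mask[y][x])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def ordinal_ranks(mask, num_levels=3):
    """Assumed grading: rank = min(distance, num_levels); background is rank 0."""
    dist = distance_to_background(mask)
    return [[0 if d is None else min(d, num_levels) for d in row] for row in dist]
```

Pixels with the highest rank lie deepest inside a nucleus, so they are natural seed markers for a marker-controlled watershed that separates touching nuclei.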
KANSeg: An efficient medical image segmentation model based on Kolmogorov-Arnold networks for multi-organ segmentation
IF 2.9, Zone 3, Engineering Technology
Digital Signal Processing, vol. 168, Article 105472. Pub Date: 2025-07-11. DOI: 10.1016/j.dsp.2025.105472
Junan Zhu, Zhizhe Tang, Zheng Liang, Ping Ma, Chuanjian Wang
Abstract: Multi-organ segmentation methods based on convolutional neural networks have achieved milestones in medical image analysis. However, challenges such as complex backgrounds and blurred boundaries between organs remain, and they lead to poor boundary segmentation. To address this issue, we propose a multi-organ segmentation method based on Kolmogorov-Arnold Networks (KAN), called KANSeg. We develop a KAN-Activated Convolution module (KAN-ACM) to construct both the encoder and decoder, thereby enhancing the learning and interpretation of intricate patterns within multi-organ images. Moreover, to further augment the model's ability to represent nonlinear features, we design a KAN bottleneck module (KAN-BM) to extract more discriminative features. Finally, we conduct comprehensive experiments on two datasets. The proposed KANSeg achieves Dice scores of 79.95% and 90.99% on the Synapse multi-organ dataset (Synapse) and the Automated Cardiac Diagnosis Challenge (ACDC) dataset, respectively. The outcomes demonstrate that our method yields more accurate segmentation results compared with state-of-the-art methods.
Citations: 0
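The Dice score reported above measures overlap between a predicted mask and a reference mask. A minimal sketch for binary masks is given below; returning 1.0 when both masks are empty is a convention assumed here, and benchmark toolkits may handle that case differently or average over organs in their own way.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total
```

For multi-organ results, a score like this is usually computed per organ label and then averaged.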