Knowledge-Based Systems: Latest Articles (Page 2)

Coordinated LLM multi-agent systems for collaborative question-answer generation

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-13 DOI: 10.1016/j.knosys.2025.114627

Sami Saadaoui, Eduardo Alonso

Abstract: Large Language Models (LLMs) excel at generating coherent and human-like questions and answers (QAs) across various topics, which can be utilized in various applications. However, their performance may be limited in domain-specific knowledge outside their training data, potentially resulting in low context recall or factual inconsistencies. This is particularly true in highly technical or specialized domains that require deep comprehension and reasoning beyond surface-level content. To address this, we propose Collective Intentional Reading through Reflection and Refinement (CIR3), a novel multi-agent framework that leverages collective intelligence for high-quality Question-Answer Generation (QAG) from domain-specific documents. CIR3 employs a transactive reasoning mechanism to facilitate efficient communication and information flow among agents. This enables in-depth document analysis and the generation of comprehensive and faithful QAs. Additionally, multi-perspective assessment ensures that QAs are evaluated from various viewpoints, enhancing their quality and relevance. A balanced collective convergence process is employed to ensure that the agents reach a consensus on the generated QAs, preventing inconsistencies and improving overall coherence. Our experiments indicate a substantial level of alignment between the CIR3-generated QAs and corresponding documents, while improving comprehensiveness by 23% and faithfulness by 17% compared to strong baseline approaches. Code and data are available at https://github.com/anonym-nlp-ai/cirrr.

Volume 330, Article 114627.

Citations: 0

Leveraging LIME explainability and Gustafson-Kessel fuzzy clustering for resume grouping and text summarization

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-13 DOI: 10.1016/j.knosys.2025.114621

Ravi Mudavath, Atul Negi

Abstract: Over the years a very large number of classification methods have been developed, which are now being referred to as "classical machine learning". However, a noticeable gap remains in research linking unsupervised learning techniques with explainable artificial intelligence (XAI) methods. In this study, we address this gap by proposing a novel method to enhance the interpretability of unsupervised learning, particularly for textual data. We integrate XAI techniques with the Gustafson-Kessel (GK) fuzzy clustering algorithm to enhance the capture of semantic relationships in text, in particular, resumes for employment. Our approach leverages the lightweight Sentence-BERT model to generate contextual embeddings that offer a deeper semantic understanding of resume data. These embeddings provide richer representations compared to traditional textual feature extraction methods. Similar resumes are clustered using the GK fuzzy clustering algorithm to identify common patterns across resumes. Subsequently, informative summaries are used for employment purposes, enhancing resume categorization and job matching. The GK algorithm was chosen because it is especially effective at handling complex structures compared with other clustering methods. In clustering practice, an evaluation of clustering quality is generally performed. In this work, we conduct statistical analysis, ablation studies, and assess performance using various clustering metrics. We also incorporate Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) to interpret the cluster memberships of individual resumes, thereby enhancing the transparency and trustworthiness of the clustering process. Our approach, when applied correctly, provides potential employers with clear and interpretable insights into how resumes are grouped. We present results on a resume data set that were summarized by our proposed method. The effective and interpretable clustering is shown in comparison with other clustering methods. The outcome is expected to improve the processing efficiency of applicant profiles and is helpful to human resource management.

Volume 330, Article 114621.

Citations: 0

Structured guided diffusion models for industrial defect image generation

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-13 DOI: 10.1016/j.knosys.2025.114642

Yulai Xie, Xiaoning Pi, Yang Zhang, Fang Ren

Abstract: Industrial defect images exhibit distinct characteristics from natural images, including severe class imbalance and structured similarity and diversity. Current defect image generation methods often lack fine-grained control over defect elements and suffer from limited diversity. This paper presents the Structured Guided Diffusion Model (Structured-GDM) for generating high-quality defect images with independent control over three structured elements: normal backgrounds, defect classes, and defect shapes. Controllability enables the generation of high-diversity defect images by preserving normal background outlines with detailed variation, specifying defect classes and shapes, and guiding the generation of reasonable (single or combined) defects using prior or expert knowledge. The structured architecture separates the training and use of elemental diffusion, classification, and segmentation models in a building-block manner, offering improved flexibility and maintainability. Additionally, a multiple-class training scheme is proposed to train overall models for one-for-all multiple-class defect generation, which exploits the inter-class similarity of defects and simplifies implementation. Extensive experiments on multiple MVTec and NEU-DET datasets demonstrate that the method achieves superior performance in both image quality metrics and downstream tasks, while maintaining high diversity and structured controllability.

Volume 330, Article 114642.

Citations: 0

Triple-view graph clustering network based on high-confidence contrastive learning strategy

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-13 DOI: 10.1016/j.knosys.2025.114625

Shifei Ding, Zhe Li, Xiao Xu, Lili Guo, Ling Ding

Abstract: Recent contrastive deep clustering models have seen considerable success. However, many of these approaches focus on distinguishing between nodes in two views for contrastive learning, which can pose significant difficulties when handling complex noisy nodes. Furthermore, numerous deep clustering models do not have a dependable framework for choosing positive and negative sample pairs. To tackle these challenges, we introduce the Triple-View Graph Clustering Network with a High-Confidence Contrastive Learning Strategy (TGCN-HCC). This model comprises two primary components. The first is a Triple-View fusion network that features parameter-shared Siamese encoders and a graph attention network, which produces semantically rich fused embeddings by combining embeddings from the three views. The second component is a self-supervised clustering module that utilizes high-confidence pseudo-label screening. This module incorporates a loss function that uses high-confidence pseudo-labels to enhance the clustering process. Comprehensive experiments on five datasets indicate that our proposed model surpasses other clustering models in performance.

Volume 330, Article 114625.

Citations: 0
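
The TGCN-HCC abstract above mentions high-confidence pseudo-label screening without giving the rule. The sketch below shows one common form of the idea, under stated assumptions: each node has a soft cluster-assignment vector, and a node is kept only when its largest probability reaches a threshold. The function name `screen_pseudo_labels`, the 0.9 threshold, and the toy probabilities are hypothetical and are not taken from the paper.

```python
def screen_pseudo_labels(probs, threshold=0.9):
    """Return {node_index: cluster_id} for nodes assigned with high confidence.

    probs: list of per-node soft assignment vectors (each sums to 1).
    threshold: assumed cut-off; the paper's actual criterion is not stated.
    """
    selected = {}
    for idx, row in enumerate(probs):
        confidence = max(row)
        if confidence >= threshold:
            # The pseudo-label is the cluster with the highest probability.
            selected[idx] = row.index(confidence)
    return selected


# Toy soft assignments for three nodes over three clusters (illustrative only).
probs = [
    [0.95, 0.03, 0.02],  # confident: cluster 0
    [0.40, 0.35, 0.25],  # ambiguous: screened out
    [0.05, 0.92, 0.03],  # confident: cluster 1
]
confident = screen_pseudo_labels(probs, threshold=0.9)
```

Only the retained nodes would then contribute to a pseudo-label loss; ambiguous nodes such as the second one are left out until the model becomes more certain about them.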

Lightweight image super-resolution with tokenized dynamic embedding network

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-13 DOI: 10.1016/j.knosys.2025.114640

Xiangyuan Zhu, Xuchong Liu, Zheng Wu

Abstract: Image super-resolution is a crucial task in computer vision, aiming to reconstruct high-resolution images from low-resolution counterparts. Despite the remarkable progress of deep learning-based methods, existing approaches often face challenges in balancing reconstruction quality, computational efficiency, and model compactness. In this paper, we propose a novel tokenized dynamic embedding network, which integrates adaptive feature tokenization and dynamic embedding mechanisms to enhance super-resolution performance while maintaining efficiency. Specifically, we employ an adaptive feature tokenization strategy to selectively extract essential tokens, reducing computational complexity while preserving key image details. Additionally, we introduce a dynamic context embedding attention module for efficient long-range dependency modeling and a dual-perspective feature integration module for integrating spatial and contextual information, ensuring both fine-grained textures and global consistency. Extensive experiments on benchmark datasets demonstrate that our method outperforms state-of-the-art lightweight models in terms of objective metrics and perceptual quality, while maintaining a compact and efficient design suitable for real-world applications. The source code is available at https://github.com/zxycs/TDEN.

Volume 330, Article 114640.

Citations: 0

High-rank corrected multi-head self attention for image super resolution

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-11 DOI: 10.1016/j.knosys.2025.114637

Ying Yuan, Zihao Ren, Yajun Qiu, Bin Sun, Shihao Kou, Caiwen Jiang, Tianliang Zhang

Abstract: Recently, Transformer-based methods have shown impressive performance in image super resolution (SR) tasks, by exploiting multi-head self attention (MSA) to capture long-range dependencies between pixels. Unfortunately, there is a low-rank bottleneck in existing Transformer-based SR methods, which limits SR performance. We demonstrate that this is because the attention map in MSA is restricted to using more non-zero singular values to make stable representation. Increasing the projection dimension of MSA can eliminate the low-rank bottleneck, but results in overwhelming computational burden. Furthermore, we observe that the attention maps of different heads in MSA exhibit both information redundancy and complementarity. Based on these findings, we propose High-Rank Corrected Multi-Head Self-Attention (HR-MSA) to capture precise dependency information by high-rank attention maps without introducing additional computational burden. Our HR-MSA first utilizes the complete information of each pixel to compute an unabridged high-rank dependency. Then it independently applies linear corrections to different heads and obtains high-rank weighted pixel information. Building around HR-MSA, we design a new architecture called High-Rank Attention for Super Resolution (HiRA-SR). Specifically, we develop Focusing Block (FB) to divert local pixel information from the HR-MSA module and introduce Residual Multi-Head Contextual Block (RMCB) to integrate global information through non-local attention. Experiments demonstrate that our HR-MSA can replace MSA and achieve efficient and effective improvements across various state-of-the-art SR methods. With parameters and FLOPs similar to SwinIR-light, our HiRA-SR sets a new state-of-the-art for lightweight image super-resolution. Our code will be available at: https://github.com/yyexplorerNB/HiRA-SR.

Volume 330, Article 114637.

Citations: 0
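
The HiRA-SR abstract above refers to a low-rank bottleneck in multi-head self attention. As a rough illustration, and not the authors' analysis, the sketch below checks a standard linear-algebra fact that is often cited in this context: for n tokens and per-head projection dimension d, the pre-softmax score matrix Q K^T is n x n but has rank at most d. Whether this is exactly the bottleneck the paper studies is an assumption; the sizes, the random integer matrices, and the helper names are hypothetical.

```python
from fractions import Fraction
import random


def matmul_transpose(q, k):
    """Return Q K^T for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]


def matrix_rank(m):
    """Exact rank via Gaussian elimination over rational numbers."""
    rows = [[Fraction(x) for x in row] for row in m]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


rng = random.Random(0)
n_tokens, head_dim = 8, 3  # assumed toy sizes: 8 tokens, 3-dimensional head
q = [[rng.randint(-5, 5) for _ in range(head_dim)] for _ in range(n_tokens)]
k = [[rng.randint(-5, 5) for _ in range(head_dim)] for _ in range(n_tokens)]

scores = matmul_transpose(q, k)   # 8 x 8 score matrix
score_rank = matrix_rank(scores)  # never exceeds head_dim
```

The 8 x 8 score matrix can never have rank above 3 here, which is why enlarging the projection dimension raises the achievable rank at the cost of more computation, the trade-off the abstract describes.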

CGC-GS: Cross geometric cues constrained Gaussian splatting

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-11 DOI: 10.1016/j.knosys.2025.114630

Zerui Yu, Zhidong Chen, Zhiheng Zhou, Hongkun Cao

Abstract: Planarized Gaussian representations, such as 2DGS, have shown great potential for geometry reconstruction. However, due to the lack of accurate geometric cues to evaluate the topology results and provide immediate feedback to the optimizer, they fail to reconstruct detailed geometry while maintaining high-quality RGB rendering. This paper introduces cross geometric cues that combine the proposed scale-invariant monocular depth, a confidence-map-controlled normal prior, and multi-view regularization consisting of projection and photometric consistency. Together these form a crossed constraint on, and evaluation of, local topology in each optimization iteration, which results in more detailed geometric representation and perspective consistency. Moreover, a global density control strategy is proposed to correct the split standard and promote the homogeneous distribution of Gaussians in the whole scene, which benefits the high-frequency extraction ability and the removal of inappropriately large Gaussians. In experiments, the proposed method outperforms the baseline on overfitting and three datasets and achieves competitive results compared to other state-of-the-art (SOTA) methods. The relevant code will be published at https://github.com/Zerui-Yu/CGC-GS.

Volume 330, Article 114630.

Citations: 0

PricoMS: Prior-coordinated multiscale synthesis network for self-supervised–aided vessel segmentation in intravascular ultrasound image amidst label scarcity

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-11 DOI: 10.1016/j.knosys.2025.114636

Xingru Huang, Huawei Wang, Shuaibin Chen, Shaowei Jiang, Retesh Bajaj, Nathan Angelo Lecaros Yap, Murat Cap, Xiaoshuai Zhang, Xingwei He, Anantharaman Ramasamy, Ryo Torii, Jouke Dijkstra, Huiyu Zhou, Christos V. Bourantas, Qianni Zhang

Abstract: Intravascular ultrasound (IVUS) imaging is invaluable in aiding diagnosis and intervention of coronary artery disease. However, its use is limited because of the increased time needed to segment the IVUS images and accurately quantify plaque burden and lesion severity. To overcome this limitation, we present a prior-coordinated multiscale synthesis network (PricoMS) for segmenting IVUS images under the condition of label scarcity. This network integrates a prior coherence paradigm (PCP), which enhances structural synthesis by maintaining consistency across scales, and a hierarchical contextual synthesis (HCS) module, which facilitates the integration of contextual information for better spatial understanding. To address the challenge of label scarcity in IVUS data, a prior encoder repeatedly utilizes unlabeled IVUS images for training, providing prior features of the images for segmentation tasks. Additionally, this network employs an adaptive morphological fusion-contextual space encoding (AMF-CSE) module to capture multi-scale and contextual data, thereby bolstering the model's capability to discern intricate vascular features even in challenging areas with suboptimal quality and imaging artifacts such as electronic noise, speckle noise, motion artifacts, and acoustic scattering. PricoMS exhibits robust performance, achieving a Dice score of 95.2% for detecting the lumen border and 84.0% for detecting the external elastic membrane (EEM) border, surpassing many existing techniques. The source code is publicly accessible at: https://github.com/IMOP-lab/PricoMS-Pytorch.

Volume 330, Article 114636.

Citations: 0
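
The PricoMS abstract above reports its results as Dice scores. For readers unfamiliar with the metric, the sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), between a predicted and a reference binary mask. It illustrates the metric only: the function name `dice_score`, the tiny masks, and the convention of returning 1.0 for two empty masks are assumptions, and the paper's exact evaluation protocol is not described in the abstract.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_target = [bool(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy 2 x 3 masks (illustrative only): 3 foreground pixels each, 2 shared.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
score = dice_score(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixels, so the reported 95.2% for the lumen border indicates a near-complete overlap with the reference annotation.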

Multi-task SAR image processing via GAN-based unsupervised manipulation

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-11 DOI: 10.1016/j.knosys.2025.114644

Xuran Hu, Mingzhe Zhu, Ziqiang Xu, Zhenpeng Feng, Haitao Yang, Ljubiša Stanković

Abstract: Generative Adversarial Networks (GANs) have shown tremendous potential in synthesizing realistic SAR images by learning patterns from data distribution. Some GANs can achieve image editing by introducing latent codes, demonstrating significant promise in SAR image processing. Compared to traditional SAR image processing methods, editing based on latent space is entirely unsupervised, allowing image processing to be conducted without any label. Additionally, the information extracted from the data is more interpretable. This paper proposes a novel SAR image processing framework called GAN-based Unsupervised Editing (GUE), aiming to address the following two issues: (1) disentangling semantic directions in GANs' latent space and finding meaningful directions; (2) establishing a comprehensive SAR image processing framework. In the implementation of GUE, we decompose the entangled semantic directions in GANs' latent space by training a carefully designed network. Moreover, it allows us to accomplish multiple SAR image processing tasks (including despeckling, auxiliary identification, and rotation editing) in a single training process without any form of supervision. Extensive experiments validate the effectiveness of our method.

Volume 330, Article 114644.

Citations: 0

MSPCNF-Net: Multi-scale parallel cross-neighborhood fusion network for medical image segmentation

IF 7.6, Zone 1 (Computer Science)

Knowledge-Based Systems Pub Date : 2025-10-11 DOI: 10.1016/j.knosys.2025.114624

Yugen Yi, Yu Duan, Xuan Wu, Hong Li, Siwei Luo, Jiangyan Dai, Xinping Rao, Yirui Jiang, Wei Zhou

Abstract: Transformer-based architectures have emerged to deal with inherent limitations of CNNs in capturing long-range dependencies for image analysis tasks. However, these approaches generally struggle to process both global and local context information simultaneously. Therefore, the paper establishes a novel dual encoder-decoder framework termed Multi-Scale Parallel Cross-Neighborhood Fusion Network (MSPCNF-Net). It develops a dual-branch network to leverage CNN and Transformer components for acquiring local and global features at multiple scales. To optimize feature fusion from the dual-branch encoders, two specialized modules are designed: the Bidirectional Window Perception Attention (BWPA) module and the Bidirectional Cross Attention (BCA) module. In addition, a Neighborhood Spatial Attention (NSA) module incorporating Gumbel-softmax is implemented by proximal pixels, which facilitates the processing of fine-grained local information and emphasizes key features with lower computational demands. Experiments are performed on four datasets with three distinct tasks including abdominal organ, cardiac organ, and retinal vessel segmentation, which indicate that MSPCNF-Net attains superior effectiveness compared to current well-known methods.

Volume 330, Article 114624.

Citations: 0
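
The MSPCNF-Net abstract above says its Neighborhood Spatial Attention module incorporates Gumbel-softmax but does not say how. The sketch below shows only the generic Gumbel-softmax relaxation: Gumbel(0, 1) noise is added to each logit, and a temperature-scaled softmax turns the result into a relaxed one-hot vector. The function name, the temperature value, and the toy logits are hypothetical, and nothing here reflects the paper's actual module design.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed one-hot sample via the Gumbel-softmax trick."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        # u is clamped away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)

    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    norm = sum(exps)
    return [e / norm for e in exps]


# Toy logits for three candidate neighbours (illustrative only).
sample = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=random.Random(0))
```

Lower temperatures push the sample toward a hard one-hot choice while the operation stays differentiable in a deep-learning framework, which is the usual reason for using the relaxation to select a few key positions.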