Knowledge-Based Systems: Latest Articles

Mechanism-empowered multivariate time series forecasting model: application to tuberculosis prediction
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114124. Pub Date: 2025-07-17. DOI: 10.1016/j.knosys.2025.114124
Danyu Li
Abstract: Tuberculosis, a highly contagious chronic disease, remains a major global public health concern. Despite medical progress, current methods struggle with new challenges, including systematic and effective downscaling, accurate prediction of disease incidence, and implementation of source reduction measures, all of which have added to the difficulty of tuberculosis control. Given the limited predictive accuracy of eight recently proposed models, this study employs a Learnable Decomposition and Dual Focus Module Model and then introduces a novel mechanism-supported multivariate spatiotemporal series framework to address the challenges in tuberculosis prediction through an investigation of coal power generation in China. This framework substantially reduces the complexity of tuberculosis prediction, enables more accurate dimensionality reduction, and improves traceability. It also enhances the interpretability and accuracy of the model applied in this study. On the test set, the proposed framework achieved an R² of 0.906, an MSE of 0.079, and an MAE of 0.160, whereas the lowest baseline scores were an R² of -4397.777, an MSE of 0.177, and an MAE of 0.227 for DLinear. This study provides a novel perspective for enhancing epidemic forecasting, exploring source reduction measures for industrial activities, and demonstrating the feasibility of AI-assisted public health strategies and green production.
Citations: 0
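The tuberculosis abstract reports R², MSE, and MAE side by side, including a strongly negative R². The sketch below uses the standard textbook definitions of these three metrics to show how a very negative R² can coexist with a moderate MSE when the targets vary little. It is an illustration only: the function name `regression_metrics` and the toy data are assumptions, and nothing here reproduces the paper's evaluation code or data.

```python
def regression_metrics(y_true, y_pred):
    """Return R^2, MSE and MAE using their standard definitions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the targets around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": ss_res / n,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Toy data (hypothetical): targets barely vary, predictions are offset by 1.
metrics = regression_metrics([1.0, 1.1, 0.9, 1.0], [2.0, 2.1, 1.9, 2.0])
print(metrics)  # MSE and MAE are about 1, while R^2 is roughly -199
```

Because R² divides the squared error by the variance of the targets, a constant offset on a nearly flat series is penalized far more heavily in R² than in MSE or MAE.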
Image deraining via dual-level contextual information associated learning for autonomous driving
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114086. Pub Date: 2025-07-17. DOI: 10.1016/j.knosys.2025.114086
Xiaofen Wang, Bin Yang, Tao Wen, Zhen Han, Zheng Wang
Abstract: In recent years, researchers have made progress in utilizing low-level contextual information to reduce the impact of rain on autonomous driving. However, objects within the same semantic category tend to share similar characteristics, aiding background recovery. To this end, we propose the Dual-Level Contextual Associated Learning Network (DCALNet), which integrates both low-level and semantic-level contextual information to better exploit object similarities. DCALNet employs a Coarse Background Recovery Module (CBRM) with a hybrid feature representation to predict a coarse derained image enriched with semantic context, effectively capturing spatial distributions of semantically similar objects, whether adjacent or non-adjacent. In addition, the Dual-Level Contextual Associated Learning Module (DCALM) further enhances background recovery by mining similarities in texture, structure, and color. To ensure semantic consistency, we apply both local and global semantic constraints, improving the accuracy of semantic information in the derained image. Experimental results show that DCALNet outperforms state-of-the-art methods by achieving an improvement of 1.33 dB/0.0034 in PSNR/SSIM, and 0.56/0.42% in mIoU/PA on average, demonstrating its effectiveness in both deraining and segmentation tasks.
Citations: 0
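The deraining abstract quotes its gain in PSNR. As a reading aid, the sketch below computes PSNR from its standard definition, 10·log10(MAX² / MSE), on flat lists of pixel values. The function name `psnr`, the default peak value of 255, and the toy pixel lists are assumptions made for illustration; the abstract does not say how the authors computed the metric.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel "images" (hypothetical values).
clean = [10.0, 20.0, 30.0, 40.0]
derained = [12.0, 19.0, 33.0, 38.0]
print(psnr(clean, derained))
```

Under this definition and a fixed peak value, a gain of 1.33 dB corresponds to multiplying the MSE by 10^(-0.133), which is about 0.74.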
On deep CFR integrated with opponent model in imperfect information games
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114105. Pub Date: 2025-07-17. DOI: 10.1016/j.knosys.2025.114105
Jiao Wang, Yun Li, Shanshan Niu, Ziyang Wu
Abstract: Imperfect information games are widely present in computer networking. For such games, the CFR family of algorithms, grounded in Nash equilibrium theory, is a safe and effective approach, but it is also highly conservative with respect to profit. Identifying opponents' behavioral models and exploiting sub-optimal opponents are efficient ways to improve agents' profits. Furthermore, the accuracy, timeliness, and generalization of the opponent model seriously affect the game results. Therefore, in this paper, a novel online implicit opponent modeling method based on a neural network is introduced into Deep Counterfactual Regret Minimization, and Opponent Modeling Deep Counterfactual Regret Minimization (ODCFR) with two structures is proposed. Moreover, different network training methods are provided for unknown and known opponents, which effectively balance high profits with security. Evaluation on the Leduc game shows that ODCFR has superior performance over Deep CFR.
Citations: 0
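The CFR abstract assumes familiarity with counterfactual regret minimization. As background, the sketch below shows regret matching, the standard rule that CFR-style algorithms use to turn cumulative regrets at a decision point into a strategy. It is the tabular textbook rule only, not ODCFR or Deep CFR (which, per the abstract, relies on neural networks); the function name and the example regrets are assumptions.

```python
def regret_matching(cumulative_regrets):
    """Map cumulative regrets (one per action) to a probability distribution.

    Actions are played in proportion to their positive regret; if no action
    has positive regret, the strategy falls back to uniform.
    """
    if not cumulative_regrets:
        raise ValueError("at least one action is required")
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Hypothetical regrets for three actions (for example fold, call, raise).
print(regret_matching([2.0, -1.0, 6.0]))  # [0.25, 0.0, 0.75]
```

An opponent model, as described in the abstract, would shift play away from this equilibrium-seeking rule to exploit sub-optimal opponents; how ODCFR does so is not specified here.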
Seeded Poisson Factorization: leveraging domain knowledge to fit topic models
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114116. Pub Date: 2025-07-17. DOI: 10.1016/j.knosys.2025.114116
Bernd Prostmaier, Jan Vávra, Bettina Grün, Paul Hofmarcher
Abstract: Topic models are widely used for discovering latent thematic structures in large text corpora, yet traditional unsupervised methods often struggle to align with pre-defined conceptual domains. This paper introduces seeded Poisson factorization (SPF), a novel approach that extends the Poisson factorization (PF) framework by incorporating domain knowledge through seed words. SPF enables a structured topic discovery by modifying the prior distribution of topic-specific term intensities, assigning higher initial rates to pre-defined seed words. The model is estimated using variational inference with stochastic gradient optimization, ensuring scalability to large datasets. We present in detail the results of applying SPF to an Amazon customer feedback dataset, leveraging pre-defined product categories as guiding structures. SPF achieves superior performance compared to alternative guided probabilistic topic models in terms of computational efficiency and classification performance. Robustness checks highlight SPF's ability to adaptively balance domain knowledge and data-driven topic discovery, even in the case of imperfect seed word selection. Further applications of SPF to four additional benchmark datasets, where the corpus varies in size and the number of topics differs, demonstrate its generally superior classification performance compared to the unseeded PF model.
Citations: 0
PerioDformer: periodic disposition enhanced transformer for times series forecasting
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114088. Pub Date: 2025-07-16. DOI: 10.1016/j.knosys.2025.114088
Wenjie Tang, Yilei Xiao, Yizhi Zhou, Heng Qi
Abstract: Due to the permutation-invariant nature of the self-attention mechanism and the absence of inherent semantic meaning in individual time points, temporal information is inevitably lost, impairing the Transformer's ability to capture temporal dependencies. However, when sequence elements are semantically rich, the impact of losing temporal order tends to be less severe. Motivated by this observation, we propose the Periodic Disposition Enhanced Transformer (PerioDformer) for time series forecasting, which transforms input sequences into semantically enriched tokens. In the proposed Periodic Disposition strategy, the input sequence is segmented into periodic blocks according to a predefined period length. These blocks are then reorganized into two complementary structures to form phase-wise and period-wise tokens, which are fed into two separate encoders. Each phase-wise token aggregates time points from the same phase across multiple periods, capturing cross-period temporal patterns at that specific phase position. In contrast, each period-wise token encapsulates an entire periodic block, preserving the complete intra-period dynamics. This periodic disposition greatly reduces the number of tokens fed into the Transformer, allowing for a significantly longer look-back window with only a marginal increase in memory consumption and computational complexity. Empirical results demonstrate that PerioDformer achieves state-of-the-art performance on six challenging real-world datasets. The source code is available at: https://github.com/wenjietang218/PerioDformer.
Citations: 0
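The PerioDformer abstract describes segmenting a series into periodic blocks and regrouping them into period-wise and phase-wise tokens. The sketch below shows one way that regrouping could look on a univariate list. It is an illustration of the idea as worded in the abstract, not the authors' implementation (which is in the linked repository); the function name `periodic_tokens` and the choice to drop the oldest points that do not fill a whole period are assumptions.

```python
def periodic_tokens(series, period):
    """Split a series into period-wise and phase-wise tokens.

    period_wise[i] holds the i-th full periodic block.
    phase_wise[p] holds the values at phase p across all blocks.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    n_periods = len(series) // period
    # Assumption: keep the most recent full periods, drop older leftovers.
    trimmed = series[len(series) - n_periods * period:]
    period_wise = [trimmed[i * period:(i + 1) * period] for i in range(n_periods)]
    phase_wise = [[block[p] for block in period_wise] for p in range(period)]
    return period_wise, phase_wise


# Seven points with an assumed period of 3: the oldest point (0) is dropped.
period_wise, phase_wise = periodic_tokens([0, 1, 2, 3, 4, 5, 6], period=3)
print(period_wise)  # [[1, 2, 3], [4, 5, 6]]
print(phase_wise)   # [[1, 4], [2, 5], [3, 6]]
```

In this sketch a look-back window of length L yields L // period period-wise tokens and `period` phase-wise tokens, rather than one token per time point, which is consistent with the token reduction the abstract describes.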
SpaMGAN: Multi-view graph augmentation network for spatial domain identification in spatial transcriptomics
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114100. Pub Date: 2025-07-16. DOI: 10.1016/j.knosys.2025.114100
Hao Liu, Yue Gao, Cui-Na Jiao, Chun-Hou Zheng, Ying-Lian Gao, Jin-Xing Liu, Yan-Li Wang
Abstract: Recent advancements in spatial transcriptomics have made it possible to profile gene expression while maintaining the spatial organization of cells, opening new avenues for exploring tissue microenvironments. However, integrating spatial and gene expression data to accurately identify spatial domains remains challenging. In this study, we present SpaMGAN as a multi-view graph augmentation network for spatial domain identification in spatial transcriptomics. The model constructs a spatial neighborhood graph by combining spot spatial proximity with cosine-weighted gene expression similarity. A pre-clustering pruning strategy generates a cell-type-aware K-nearest neighbor graph to better capture spatial similarity at domain boundaries. These graphs are merged into a weighted adjacency matrix. To enhance robustness and generalization, SpaMGAN incorporates adjacency matrix weighting, node shuffling, and feature masking. Using a consistency-based contrastive strategy, multiple augmented graph views are processed through graph convolution layers, and feature representations are fused via an attention mechanism. Evaluated on four datasets from three spatial transcriptomics platforms, SpaMGAN outperforms eight advanced methods. Specifically, the algorithm achieved the highest adjusted Rand index scores of 0.594 and 0.585 on the datasets of the human dorsolateral prefrontal cortex and mouse visual cortex, respectively. In breast cancer tissue, SpaMGAN effectively reveals spatial heterogeneity, offering insights into the tumor microenvironment. On large-scale datasets such as mouse embryos, it identifies major anatomical regions and uncovers biologically meaningful domains enriched in developmental processes. Overall, SpaMGAN demonstrates strong scalability and biological interpretability, making it a powerful tool for analyzing tissue structure and disease mechanisms in spatial transcriptomics.
Citations: 0
Spatiotemporal isomorphic cross-brain region interaction network for cross-subject EEG emotion recognition
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114115. Pub Date: 2025-07-16. DOI: 10.1016/j.knosys.2025.114115
Yanling An, Shaohai Hu, Shuaiqi Liu, Zhihui Gu, Yuan Zhang, Yudong Zhang
Abstract: Electroencephalography (EEG) offers high temporal resolution at low cost and has become one of the important tools for emotion recognition in human-computer interaction. However, given the intricate architecture and functioning of the brain and the substantial variation among individual participants, existing methods struggle to simultaneously model the temporal and spatial consistency of brain-area interactions and EEG signals between subjects, which limits the generalization performance of the model in cross-subject contexts. To meet this challenge, we propose a cross-subject EEG emotion recognition model based on a spatiotemporal isomorphic cross-brain region interaction network (STCBI-Nets). First, we design the cross-brain region interaction module (CBI), which dynamically models the interaction relationship between different brain regions through a multi-head cross-attention mechanism, captures heterogeneous information flow between local brain regions, enhances the long-range dependency modeling ability of EEG time series, and effectively integrates the collaborative activation mode of the whole brain. Second, we design a spatiotemporal isomorphic adaptive fusion (STIAF) block, which adopts a dual-branch structure to mine hierarchical and complementary information of spatiotemporal features and introduces a negative-sample-weighted contrastive learning mechanism and dynamic fusion strategy to improve the robustness and discriminative power of cross-view shared representations, thereby enhancing the model's adaptability to different subject features. Finally, we propose a joint optimized adaptive domain alignment strategy (JOADAS), which combines global adversarial learning with an adaptive class center alignment mechanism to reduce domain bias between different subjects at both the macro and micro levels, enhance intra-class aggregation and inter-class separability, and improve the model's discriminative performance and cross-subject generalization ability. Extensive experiments on multiple datasets show that STCBI-Nets outperforms state-of-the-art (SOTA) methods and exhibits stronger generalization ability and stability in cross-subject EEG emotion recognition tasks.
Citations: 0
Efficient RGB-D scene understanding via multi-task adaptive learning and cross-dimensional feature guidance
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114107. Pub Date: 2025-07-16. DOI: 10.1016/j.knosys.2025.114107
Guodong Sun, Junjie Liu, Gaoyang Zhang, Bo Wu, Yang Zhang
Abstract: Scene understanding plays a critical role in enabling intelligence and autonomy in robotic systems. Traditional approaches often face challenges, including occlusions, ambiguous boundaries, and the inability to adapt attention based on task-specific requirements and sample variations. To address these limitations, this paper presents an efficient RGB-D scene understanding model that performs a range of tasks, including semantic segmentation, instance segmentation, orientation estimation, panoptic segmentation, and scene classification. The proposed model incorporates an enhanced fusion encoder, which effectively leverages redundant information from both RGB and depth inputs. For semantic segmentation, we introduce normalized focus channel layers and a context feature interaction layer, designed to mitigate issues such as shallow feature misguidance and insufficient local-global feature representation. The instance segmentation task benefits from a non-bottleneck 1D structure, which achieves superior contour representation with fewer parameters. Additionally, we propose a multi-task adaptive loss function that dynamically adjusts the learning strategy for different tasks based on scene variations. Extensive experiments on the NYUv2, SUN RGB-D, and Cityscapes datasets demonstrate that our approach outperforms existing methods in both segmentation accuracy and processing speed.
Citations: 0
DMRL: A distributed multi-agent reinforcement learning algorithm for imbalanced classification
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114101. Pub Date: 2025-07-16. DOI: 10.1016/j.knosys.2025.114101
Yixin Ji, Chao Jing
Abstract: Traditional imbalanced classification methods rely on sampling or allocating different weights to different classes to improve the recognition rate for minority classes. However, these methods ignore the importance of adaptability, particularly as the degree of imbalance increases, resulting in significant limitations when dynamically selecting the optimal classification strategy. To tackle this issue, we propose a distributed multi-agent reinforcement learning (DMRL) method for imbalanced classification problems, which models the classification problem as a multi-agent Markov decision-making process within the distributed computing scheme. DMRL implements three schemes: 1) a multi-agent classification scheme based on an improved double deep Q-network (MCSQ) that dynamically optimizes the imbalanced classification strategy through the reward function and importance weights; 2) a prioritized experience replay-based scheme for sampling agents' experience (PERS) that uses prioritized experience replay to learn from important samples; 3) a distributed computing scheme based on the multi-agent centralized training and decentralized execution (CTDE) paradigm (DCSM) that combines distributed computing with multi-agent CTDE to improve learning efficiency. Finally, we run experiments on public datasets such as IMDB, CIFAR-10, Fashion-MNIST, and MNIST under various imbalance ratios. Experimental results demonstrate that DMRL outperforms eight representative methods, with a maximum improvement of 8.9% in G-mean and 10.7% in F-measure when compared to the suboptimal method. We also study the impact of the value of the reward function, the number of agents and servers, and the effectiveness of prioritized experience replay on the performance of DMRL.
Citations: 0
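The DMRL abstract reports its gains in G-mean and F-measure, two metrics commonly used instead of accuracy on imbalanced data. The sketch below computes both from their usual binary definitions (G-mean as the geometric mean of recall and specificity, F-measure as F1). The binary setting, the function name `g_mean_and_f1`, and the toy labels are assumptions; the abstract does not state which variants the authors used.

```python
import math


def g_mean_and_f1(y_true, y_pred, positive=1):
    """Return (G-mean, F1) for binary labels, treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(recall * specificity)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return g_mean, f1


# Hypothetical 2:8 imbalanced labels. Predicting only the majority class
# gives 80% accuracy but a G-mean and F1 of zero.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(g_mean_and_f1(y_true, [0] * 10))  # (0.0, 0.0)
```

Both metrics collapse to zero when the minority class is never predicted, which is why they are preferred over accuracy as the imbalance ratio grows.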
A natural language processing-based approach for early detection of heart failure onset using electronic health records
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Volume 327, Article 114102. Pub Date: 2025-07-16. DOI: 10.1016/j.knosys.2025.114102
Yuxi Liu, Zhen Tan, Zhenhao Zhang, Song Wang, Jingchuan Guo, Huan Liu, Tianlong Chen, Jiang Bian
Abstract: Objective: This study set out to develop and validate a risk prediction tool for the early detection of heart failure (HF) onset using real-world EHR data. Background: While existing HF risk assessment models have shown promise in clinical settings, they are often tailored to specific medical conditions, limiting their generalizability. Moreover, most methods rely on hand-crafted features, making it difficult to capture the high-dimensional, sparse, and temporal nature of EHR data, thus reducing their predictive accuracy. Methods: A total of 2561 HF and 5493 matched control patients were identified from the OneFlorida+ Clinical Research Consortium. We employed a suite of natural language processing (NLP) models, including Bag of Words, Skip-gram, and ClinicalBERT, to generate EHR embeddings, which were used as inputs for five prediction models. Subgroup analyses were conducted across gender, race, and ethnicity. Model calibration was assessed under three calibration scenarios: no recalibration, recalibration in the large, and logistic recalibration. Results: The XGBoost model demonstrated the best overall performance, achieving an AUROC of 0.7672, an F1 score of 0.5547, an AUPRC of 0.6382, and a Matthews correlation coefficient of 0.3993. The most impactful predictors included diagnoses, procedures, medications, lab tests, and patient age. Model performance varied across gender, race, and ethnicity subgroups. Logistic recalibration significantly improved model calibration in the overall cohort and demographic subgroups. Conclusion: Our NLP-based approach demonstrated strong predictive performance and practical relevance, highlighting its potential for integration into real-world clinical applications to facilitate the early detection and proactive management of individuals at risk for HF.
Citations: 0
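Among the metrics in the heart failure abstract, the Matthews correlation coefficient (MCC) is the least familiar. The sketch below computes it from a binary confusion matrix using the standard formula. The function name and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef_from_counts(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level) to +1
    (perfect prediction). Returns 0.0 when a marginal total is zero, a
    common convention for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a cohort with 10 cases and 10 controls.
print(matthews_corrcoef_from_counts(tp=6, fp=2, fn=4, tn=8))
```

Unlike F1, MCC uses all four cells of the confusion matrix, so it also reflects how well the controls are classified.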
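Contact: info@booksci.cn. Book学术 provides a free academic resource search service that helps scholars in China and abroad find Chinese- and English-language literature, and aims to offer a convenient, high-quality service experience. Copyright © 2023 布克学术. All rights reserved.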