Weiguang Wang , Lijuan Ma , Wei Cai , Haiyan Zhao , Xia Zhang
{"title":"HMEA: A hierarchical medical knowledge graph entity alignment model fusing multi-aspect information","authors":"Weiguang Wang , Lijuan Ma , Wei Cai , Haiyan Zhao , Xia Zhang","doi":"10.1016/j.artmed.2025.103188","DOIUrl":"10.1016/j.artmed.2025.103188","url":null,"abstract":"<div><div>Medical entity alignment is crucial for the integration and reasoning of medical knowledge, aiming to match semantically equivalent entities across different medical knowledge graphs. Unlike entities in general knowledge graphs, medical entities contain rich multi-aspect information, which not only includes structural and attribute information but also additional information such as ontology and descriptions. However, existing entity alignment methods overlook these additional pieces of information and lack exploration into the fusion of multi-aspect information. This leads to less-than-ideal performance in medical entity alignment. To address the aforementioned issues, in this paper, we propose a hierarchical medical knowledge graph entity alignment method, termed HMEA, which integrates multi-aspect information. Firstly, we represent the medical knowledge graph as a hierarchical heterogeneous graph to model the multi-aspect information of medical entities. Secondly, we design different representation learning methods according to the characteristics of multi-aspect information to obtain vector representations of entities in different dimensions. Subsequently, we devise a two-stage multi-aspect knowledge fusion mechanism to dynamically integrate multi-aspect information, enabling mutual complementarity. Finally, we utilize the fused entity vector representations to guide entity alignment. We compare our approach with state-of-the-art baseline models on ten different types of publicly available datasets and further conduct ablation and parameter analyses. Experimental results validate the effectiveness and robustness of the proposed model. 
In benchmark tests across all datasets, HMEA outperforms the current state-of-the-art methods significantly.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"168 ","pages":"Article 103188"},"PeriodicalIF":6.1,"publicationDate":"2025-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144633097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaobo Li , Yijia Zhang , Xiaodi Hou , Shilong Wang , Hongfei Lin
{"title":"Deep learning for automatic ICD coding: Review, opportunities and challenges","authors":"Xiaobo Li , Yijia Zhang , Xiaodi Hou , Shilong Wang , Hongfei Lin","doi":"10.1016/j.artmed.2025.103187","DOIUrl":"10.1016/j.artmed.2025.103187","url":null,"abstract":"<div><h3>Background:</h3><div>The automatic International Classification of Diseases (ICD) coding task assigns unique medical codes to diseases in clinical texts for further data statistics, quality control, billing and other tasks. The efficiency and accuracy of medical code assignment is a significant challenge affecting healthcare. However, in clinical practice, Electronic Health Records (EHRs) data are usually complex, heterogeneous, non-standard and unstructured, and the manual coding process is time-consuming, laborious and error-prone. Traditional machine learning methods struggle to extract significant semantic information from clinical texts accurately, but the latest progress in Deep Learning (DL) has shown promising results to address these issues.</div></div><div><h3>Objective:</h3><div>This paper comprehensively reviewed recent advancements in utilizing deep learning for automatic ICD coding, which aimed to reveal prominent challenges and emerging development trends by summarizing and analyzing the model’s year, design motivation, deep neural networks, and auxiliary data.</div></div><div><h3>Methods:</h3><div>This review introduced systematic literature on automatic ICD coding methods based on deep learning. We screened 5 online databases, including Web of Science, SpringerLink, PubMed, ACM, and IEEE digital library, and collected 53 published articles related to deep learning-based ICD coding from 2017 to 2023.</div></div><div><h3>Results:</h3><div>These deep neural network methods aimed to overcome some challenges, such as lengthy and noisy clinical text, high dimensionality and functional relationships of medical codes, and long-tail label distribution. 
Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), attention mechanisms, Transformers, Pre-trained Language Models (PLMs), etc., have become popular for addressing prominent issues in ICD coding. Meanwhile, introducing medical ontology within the ICD coding system (code description and code hierarchy) and external knowledge (Wikipedia articles, tabular data, Clinical Classification Software (CCS), fine-tuning PLMs based on biomedical corpora, entity recognition and concept extraction) has become an emerging trend for automatic ICD coding.</div></div><div><h3>Conclusion:</h3><div>This paper provided a comprehensive review of recent literature on applying deep learning technology to improve medical code assignment from a unique perspective. Multiple neural network methods (CNNs, RNNs, Transformers, PLMs, especially attention mechanisms) have been successfully applied in ICD tasks and achieved excellent performance. Various medical auxiliary data have also proven valuable in enhancing model feature representation and classification performance. Our in-depth and systematic analysis suggested that the automatic ICD coding method based on deep learning has a bright future in healthcare. Finally, we discussed some major challenges and outlined future development directio","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"168 ","pages":"Article 103187"},"PeriodicalIF":6.1,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144614723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aliaksandra Sikirzhytskaya , Ilya Tyagin , S. Scott Sutton , Michael D. Wyatt , Ilya Safro , Michael Shtutman
{"title":"AI-based mining of biomedical literature: Applications for drug repurposing for the treatment of dementia","authors":"Aliaksandra Sikirzhytskaya , Ilya Tyagin , S. Scott Sutton , Michael D. Wyatt , Ilya Safro , Michael Shtutman","doi":"10.1016/j.artmed.2025.103218","DOIUrl":"10.1016/j.artmed.2025.103218","url":null,"abstract":"<div><div>Neurodegenerative diseases like Alzheimer's, Parkinson's, and HIV-associated neurocognitive disorder severely impact patients and healthcare systems. While effective treatments remain limited, researchers are actively developing ways to slow progression and improve patient outcomes, requiring innovative approaches to handle huge volumes of new scientific data. To enable the automatic analysis of biomedical data, we introduced AGATHA, an effective AI-based literature mining tool that can navigate massive scientific literature databases. The overarching goal of this effort is to adapt AGATHA for drug repurposing by revealing hidden connections between FDA-approved medications and a health condition of interest. Our tool converts the abstracts of peer-reviewed papers from PubMed into multidimensional space where each gene and health condition is represented by specific metrics. We implemented advanced statistical analysis to reveal distinct clusters of scientific terms within the virtual space created using AGATHA-calculated parameters for selected health conditions and genes. Partial Least Squares Discriminant Analysis was employed for categorizing and predicting samples (122 diseases and 20,889 genes) fitted to specific classes. Advanced statistics were employed to build a discrimination model and extract lists of genes specific to each disease class. We focused on repurposing drugs for dementia by identifying dementia-associated genes highly ranked in other disease classes. The method was developed to detect genes that are shared across multiple conditions and to classify them based on their roles in biological pathways. 
This led to the selection of six primary drugs for further study.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"168 ","pages":"Article 103218"},"PeriodicalIF":6.1,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144654111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weidun Xie , Xingjian Chen , Lei Huang , Zetian Zheng , Yuchen Wang , Ruoxuan Zhang , Xiao Zhang , Zhichao Liu , Chengbin Peng , Monika Gullerova , Ka-chun Wong
{"title":"DDintensity: Addressing imbalanced drug-drug interaction risk levels using pre-trained deep learning model embeddings","authors":"Weidun Xie , Xingjian Chen , Lei Huang , Zetian Zheng , Yuchen Wang , Ruoxuan Zhang , Xiao Zhang , Zhichao Liu , Chengbin Peng , Monika Gullerova , Ka-chun Wong","doi":"10.1016/j.artmed.2025.103202","DOIUrl":"10.1016/j.artmed.2025.103202","url":null,"abstract":"<div><div>Imbalanced datasets have been a persistent challenge in bioinformatics, particularly in the context of drug-drug interaction (DDI) risk level datasets. Such imbalance can lead to biased models that perform poorly on underrepresented classes. To address this issue, one strategy is to construct a balanced dataset, while another involves employing more advanced features and models. In this study, we introduce a novel approach called DDintensity, which leverages pre-trained deep learning models as embedding generators combined with LSTM-attention models to address the imbalance in DDI risk level datasets. We tested embeddings from various domains, including images, graphs, and text corpora. Among these, embeddings generated by BioGPT achieved the highest performance, with an Area Under the Curve (AUC) of 0.97 and an Area Under the Precision-Recall curve (AUPR) of 0.92. Our model was trained on DDinter and further validated using the MecDDI dataset. Additionally, case studies on two chemotherapeutic drugs used in oncology, DB00398 (Sorafenib) and DB01204 (Mitoxantrone), were conducted to demonstrate the specificity and effectiveness of this method. Our approach demonstrates high scalability across DDI modalities, as well as the ability to discover novel interactions. 
In summary, we introduce DDintensity as a solution for imbalanced datasets in bioinformatics with pre-trained deep-learning embeddings.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"168 ","pages":"Article 103202"},"PeriodicalIF":6.1,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144549994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ahmad Berjaoui , Eduardo Hugo Sanchez , Louis Roussel , Elizabeth Cohen-Jonathan Moyal
{"title":"Uncovering the genetic basis of glioblastoma heterogeneity through multimodal analysis of whole slide images and RNA sequencing data","authors":"Ahmad Berjaoui , Eduardo Hugo Sanchez , Louis Roussel , Elizabeth Cohen-Jonathan Moyal","doi":"10.1016/j.artmed.2025.103191","DOIUrl":"10.1016/j.artmed.2025.103191","url":null,"abstract":"<div><div>Glioblastoma is a highly aggressive form of brain cancer characterized by rapid progression and poor prognosis. Despite advances in treatment, the underlying genetic mechanisms driving this aggressiveness remain poorly understood. In this study, we employed multimodal deep learning approaches to investigate glioblastoma heterogeneity using joint image/RNA-seq analysis. Our results reveal novel genes associated with glioblastoma. By leveraging a combination of whole-slide images and RNA-seq, as well as introducing novel methods to encode RNA-seq data, we identified specific genetic profiles that may explain different patterns of glioblastoma progression. These findings provide new insights into the genetic mechanisms underlying glioblastoma heterogeneity and highlight potential targets for therapeutic intervention. Code and data downloading instructions are available at: https://github.com/ma3oun/gbheterogeneity.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"168 ","pages":"Article 103191"},"PeriodicalIF":6.1,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144563602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Antonio Lopez-Martinez-Carrasco , Hugo M. Proença , Jose M. Juarez , Matthijs van Leeuwen , Manuel Campos
{"title":"Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery","authors":"Antonio Lopez-Martinez-Carrasco , Hugo M. Proença , Jose M. Juarez , Matthijs van Leeuwen , Manuel Campos","doi":"10.1016/j.artmed.2025.103200","DOIUrl":"10.1016/j.artmed.2025.103200","url":null,"abstract":"<div><div>Antibiotic resistance is one of the major global threats to human health and occurs when antibiotics lose their ability to combat bacterial infections. In this problem, a clinical decision support system could use phenotypes in order to alert clinicians of the emergence of patterns of antibiotic resistance in patients. Patient phenotyping is the task of finding a set of patient characteristics related to a specific medical problem such as the one described in this work. However, a single explanation of a medical phenomenon might be useless in the eyes of a clinical expert and be discarded. The discovery of multiple patient phenotypes for the same medical phenomenon would be useful in such cases. Therefore, in this work, we define the problem of mining diverse top-k phenotypes and propose the EDSLM algorithm, which is based on the Subgroup Discovery technique, the subgroup list model, and the Minimum Description Length principle. Our proposal provides clinicians with a method with which to obtain multiple and diverse phenotypes of a set of patients. 
We show a real use case of phenotyping in antimicrobial resistance using the well-known MIMIC-III dataset.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"167 ","pages":"Article 103200"},"PeriodicalIF":6.1,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144514073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ContraDTI: Improved drug–target interaction prediction via multi-view contrastive learning","authors":"Zhirui Liao , Lei Xie , Shanfeng Zhu","doi":"10.1016/j.artmed.2025.103195","DOIUrl":"10.1016/j.artmed.2025.103195","url":null,"abstract":"<div><div>Drug–target interaction (DTI) identification is one of the crucial issues in the field of drug discovery. Machine learning approaches offer efficient ways to address this issue, reducing expensive and time-consuming laboratory experiments. However, the scarcity of labeled drug data restricts the application of supervised machine learning to DTI prediction. Drawing inspiration from recent advances in contrastive learning, we present ContraDTI, a novel framework that adopts multi-view contrastive learning to overcome data limitations. Our model considers the molecular graph of a drug as the main view and the SMILES string of a drug as the side view, employing two types of loss functions for the contrast of the main view and the cross-view alignment between the main and the side views. Extensive experiments on both single-target and multi-target DTI datasets demonstrate that ContraDTI enhances the classification performance of DTI prediction, particularly when labeled data is scarce. ContraDTI can be a powerful tool for DTI prediction in data-limited scenarios. 
The code of this paper is available at https://github.com/zhiruiliao/ContraDTI.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"168 ","pages":"Article 103195"},"PeriodicalIF":6.1,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144534645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yazhou Zhu , Minxian Li , Qiaolin Ye , Shidong Wang , Tong Xin , Haofeng Zhang
{"title":"RobustEMD: Domain robust matching for cross-domain few-shot medical image segmentation","authors":"Yazhou Zhu , Minxian Li , Qiaolin Ye , Shidong Wang , Tong Xin , Haofeng Zhang","doi":"10.1016/j.artmed.2025.103197","DOIUrl":"10.1016/j.artmed.2025.103197","url":null,"abstract":"<div><div>Few-shot medical image segmentation (FSMIS) aims to learn from limited annotated data in medical image analysis. Despite the progress achieved, current FSMIS models are all trained and deployed on the same data domain, which is inconsistent with the clinical reality that medical imaging data always spans different data domains (e.g., imaging modalities, institutions and equipment sequences). In this paper, we introduce Cross-domain Few-shot Medical Image Segmentation (CD-FSMIS) and propose a RobustEMD matching mechanism based on Earth Mover’s Distance (EMD) to enhance cross-domain generalization. Our approach includes three key components: (1) a channel-wise feature decomposition strategy that uniformly divides support and query features into local nodes, (2) a texture-structure-aware weight generation method that restrains domain-relevant nodes through Sobel-based gradient calculation, and (3) a boundary-aware Hausdorff distance measurement for transportation cost calculation. Extensive experiments across three scenarios (cross-modal, cross-sequence and cross-institution) show that our method significantly outperforms existing approaches. Ablation studies further confirm that each component of our RobustEMD mechanism contributes to the enhanced performance. The experimental outcomes highlight the strong generalization capabilities of our model in real-world heterogeneous medical imaging environments. 
Code is available at https://github.com/YazhouZhu19/RobustEMD.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"167 ","pages":"Article 103197"},"PeriodicalIF":6.1,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144481460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GraphCF: Drug–target interaction prediction via multi-feature fusion with contrastive graph neural network","authors":"Dianlei Gao, Fei Zhu","doi":"10.1016/j.artmed.2025.103196","DOIUrl":"10.1016/j.artmed.2025.103196","url":null,"abstract":"<div><div>Drug–target interaction (DTI) prediction is paramount in drug discovery and repurposing, which involves screening for effective candidate drugs by targeting specific proteins. Existing methods often focus on one or two representations of drugs or targets, and little has been explored regarding 3D structures. Moreover, how to capture interactions between multi-modal features comprehensively is also a key issue. A multi-modal interaction fusion method called GraphCF is proposed to overcome these limitations. Specifically, GraphCF uses a MixHop aggregator to gather higher-order neighborhood information between nodes in the DTI topological network and incorporates graph contrastive learning to capture more discriminative 2D representations of drugs and targets. Additionally, GraphCF utilizes convolutional neural networks and graph neural networks to extract the sequence and 3D structural features of drugs and targets, respectively. Then, GraphCF employs a cross-attention-based multi-feature fusion module to facilitate information interaction and fusion among multi-modal feature representations. 
GraphCF is evaluated and compared with some advanced methods on four public datasets, and the results demonstrate the competitive performance of GraphCF in DTI prediction.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"167 ","pages":"Article 103196"},"PeriodicalIF":6.1,"publicationDate":"2025-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144518568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}