Shahid Mohammad Ganie , Pijush Kanti Dutta Pramanik
{"title":"Interpretable lung cancer risk prediction using ensemble learning and XAI based on lifestyle and demographic data","authors":"Shahid Mohammad Ganie , Pijush Kanti Dutta Pramanik","doi":"10.1016/j.compbiolchem.2025.108438","DOIUrl":"10.1016/j.compbiolchem.2025.108438","url":null,"abstract":"<div><div>Lung cancer is a leading cause of cancer-related death worldwide. The early and accurate detection of lung cancer is crucial for improving patient outcomes. Traditional predictive models often lack the accuracy and interpretability required in clinical settings. This study aims to enhance lung cancer prediction accuracy using ensemble learning methods while integrating explainable AI (XAI) techniques to ensure model interpretability. Advanced ensemble learning techniques, such as Voting and Stacking, have been implemented to improve the predictive accuracy compared to traditional models. The models are implemented on three real lung cancer datasets, comprising lifestyle data of the patients, and assessed using various performance metrics, highlighting their reliability in clinical diagnosis. XAI methods are incorporated to ensure the models are interpretable, fostering trust among clinicians. SHAP (SHapley Additive exPlanations) values are utilized to identify and prioritize clinical and demographic factors influencing risk predictions. The ensemble models demonstrate superior performance metrics, significantly improving lung cancer prediction accuracy. Specifically, the Stacking ensemble model achieves the average prediction accuracy of 99.59 %, precision of 100 %, recall of 97.64 %, F1-score 98.65 %, AUC of 100 %, Kappa 98.40 %, and MCC of 98.44 % across three datasets. We employed the Friedman aligned ranks test and Holm post hoc analysis to validate performance, showing that the Stacking ensemble consistently outperformed others with higher accuracy and reliable predictions. 
Feature importance analysis reveals critical risk factors, providing insights into their interconnectivity and enhancing risk assessment frameworks. Integrating XAI techniques ensures the models are interpretable, promoting their potential adoption in clinical practices. The findings support the development of targeted interventions and effective risk management strategies, aiming to improve patient outcomes in lung cancer diagnosis and treatment.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108438"},"PeriodicalIF":2.6,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143746620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
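The abstract above reports accuracy, precision, recall, F1-score, Kappa, and MCC. As a minimal sketch of how these metrics relate to a binary confusion matrix, the following Python computes them from raw counts. The function name `binary_metrics` and the counts in the usage lines are hypothetical illustrations, not values or code from the paper.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # Matthews correlation coefficient (MCC).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa, "mcc": mcc}


# Made-up counts: no false positives, two false negatives.
metrics = binary_metrics(tp=40, fp=0, fn=2, tn=58)
print(metrics)
```

With no false positives, precision is exactly 1 while recall stays below 1. That is the same qualitative pattern as the reported 100 % precision and 97.64 % recall.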
{"title":"PayloadGenX, a multi-stage hybrid virtual screening approach for payload design: A microtubule inhibitor case study","authors":"Faheem Ahmed , Anupama Samantasinghar , Naina Sunildutt , Kyung Hyun Choi","doi":"10.1016/j.compbiolchem.2025.108439","DOIUrl":"10.1016/j.compbiolchem.2025.108439","url":null,"abstract":"<div><div>Due to the rapid emergence of treatment-resistant cancers, there is a growing need to discover new anticancer therapies. Antibody-drug conjugates (ADCs) are aimed at solving this problem by specifically targeting and delivering cytotoxic payloads directly to cancer cells, thereby minimizing damage to healthy cells and enhancing treatment efficacy. Therefore, it is highly important to find an effective cytotoxic payload to ensure maximum therapeutic benefit and overcome cancer resistance. To address this challenge, we have developed a multi-stage hybrid virtual screening (VS) approach for payload design. We collected approximately 900 million molecules from databases such as ZINC12, ChEMBL, PubChem, and QM9. Additionally, 220 approved small-molecule anticancer drugs were collected. Initially, these molecules were screened based on the Lipinski Rule of Five (RO5) criteria, resulting in 20 million molecules that met the drug-likeness criteria. Subsequently, fragments, the key element of this approach, were generated from the approved small-molecule cancer drugs. This fragment-based screening identified 6,500, 36,770, and 150,000 anticancer-like drugs at similarity thresholds greater than 0.6, 0.5, and 0.4, respectively. Raising the similarity threshold toward 1 improves the chance of discovering anticancer-like drugs. Further molecular docking of these anticancer-like drugs with β-tubulin resulted in identifying the top 1000 ranked drugs as microtubule inhibitors. 
ADMET analysis and synthetic validation, followed by cell cytotoxicity assessment, then helped shortlist the 5 most effective payloads for further confirmation in a preclinical setting. Additionally, molecular dynamics simulation was performed to confirm the structural stability and conformational dynamics of the β-tubulin-ligand complexes over 100 ns. In conclusion, this study effectively utilizes extensive compound databases and multi-stage screening methods to identify potent payloads, demonstrating promising advancements in discovering effective anticancer therapies.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108439"},"PeriodicalIF":2.6,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143740009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
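The first screening stage in the abstract above filters molecules by the Lipinski Rule of Five. A minimal sketch of such a filter follows, assuming the descriptors are already computed. The `Descriptors` class, the `max_violations` parameter, and the three library entries are hypothetical. The abstract does not say whether any violation was tolerated, so the strict default of zero is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    name: str
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count how many of the four Rule-of-Five limits the molecule exceeds."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_ro5(d: Descriptors, max_violations: int = 0) -> bool:
    # Assumption: strict filtering (no violations allowed) by default.
    return ro5_violations(d) <= max_violations


# Made-up descriptor values, for illustration only.
library = [
    Descriptors("mol_a", 342.4, 2.1, 2, 5),
    Descriptors("mol_b", 612.7, 4.3, 3, 9),    # too heavy
    Descriptors("mol_c", 480.0, 5.8, 6, 11),   # three violations
]

drug_like = [d.name for d in library if passes_ro5(d)]
print(drug_like)
```

In practice the descriptors would come from a cheminformatics toolkit. Hard-coding them here keeps the sketch runnable with no third-party dependency.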
{"title":"An insight into in silico strategies used for exploration of medicinal utility and toxicology of nanomaterials","authors":"Tahmeena Khan","doi":"10.1016/j.compbiolchem.2025.108435","DOIUrl":"10.1016/j.compbiolchem.2025.108435","url":null,"abstract":"<div><div>The exploration of nanomaterials (NMs) and their wide-ranging uses is an emerging research area. NMs have improved physicochemical and biological properties and diverse functionality owing to their unique shape and size, and they are therefore being explored for many uses, particularly as medicinal and therapeutic agents. Nanoparticles (NPs), including metal and metal oxide-based NPs, have received substantial consideration because of their biological applications. Computer-aided drug design (CADD) involves strategies such as homology modelling, molecular docking, virtual screening (VS), and quantitative structure-activity relationship (QSAR) modelling, which are important for lead identification and target identification. Despite this importance, very few computational studies have so far explored the binding of NMs to target proteins and macromolecules. Although the structural properties of nanomaterials are well documented, it is worthwhile to understand how they interact with target proteins. This review discusses some important computational strategies such as molecular docking and simulation, Nano-QSAR, quantum chemical calculations based on density functional theory (DFT), and computational nanotoxicology. Nano-QSAR modelling, based on semiempirical calculations and computational simulation, can be useful for biomedical applications, whereas DFT calculations make it possible to understand the behaviour of the material from quantum mechanics, without the requirement of higher-order material properties. 
Beyond the beneficial interactions, it is also important to understand the hazardous consequences of engineered nanostructures, since NPs can penetrate deeply into the human body; computational nanotoxicology has emerged as a potential strategy to predict the deleterious effects of NMs. Although computational tools are helpful, further studies such as <em>in vitro</em> assays are still required to get the complete picture, which is essential in the development of potent and safe drug entities.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108435"},"PeriodicalIF":2.6,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143725415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Functionalized p-cymene and pyrazine derivatives: Physicochemical, ADMT, drug-likeness, and DFT studies","authors":"Goncagül Serdaroğlu","doi":"10.1016/j.compbiolchem.2025.108434","DOIUrl":"10.1016/j.compbiolchem.2025.108434","url":null,"abstract":"<div><div>The <em>p</em>-cymene and pyrazine derivatives functionalized with hydroxy and methoxy group(s) were examined through systematic computational analyses to explore their electronic structural properties, which would play a critical role in their biochemical reactivity. The DFT computations of the data set were performed at the B3LYP/6-311G** level to predict the structural and electronic properties as well as the physicochemical values. Physicochemical properties such as lipophilicity and water solubility were determined because these values should be in balance with each other in early-stage drug-design research. The averaged lipophilicity values of the <em>p</em>-cymene and pyrazine derivatives were calculated as CYM3 (2.39) < CYM1 (2.82) < CYM4 (3.11) < CYM2 (3.21) < CYM (3.50) and PYZ3 (1.22) < PYZ (1.28) < PYZ1 (1.40) < PYZ2 (1.79) < PYZ4 (2.00), respectively. According to the ESOL approach, the water solubility (mg/mL) ×10<sup>−2</sup> values of the <em>p</em>-cymene and pyrazine compounds followed the orders CYM3 (15.6) > CYM4 (10.2) > CYM1 (7.40) > CYM2 (5.16) > CYM (3.12) and PYZ (512) > PYZ1 (170) > PYZ3 (166) > PYZ2 (118) > PYZ4 (77.3), respectively. The ADMT properties of the data set were examined in detail to estimate structural advantages or disadvantages, because possible side effects on human health and the environment have to be considered, in addition to possible potencies, when designing a novel agent. All compounds would be promising agents in terms of Caco-2 and MDCK penetration and Pgp-inhibition potencies. 
According to the IGC<sub>50</sub>, LC<sub>50</sub>FM, and LC<sub>50</sub>DM results, the <em>p</em>-cymene compounds could have lower (or no) risk than the glyphosate and pyrazine derivatives like being for BCF scores. The FMO analyses were performed to estimate the possible reactive region for nucleophilic or electrophilic attacks.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108434"},"PeriodicalIF":2.6,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yanfei Mo , Yaoqi Ge , Dan Wang , Jizheng Wang , Rihua Zhang , Yifang Hu , Xiaoxuan Qin , Yanyan Hu , Shan Lu , Yun Liu , Wen-Song Zhang
{"title":"Comprehensive analysis of single-cell and bulk transcriptome unravels immune landscape of atherosclerosis and develops a S100 family based-diagnostic model","authors":"Yanfei Mo , Yaoqi Ge , Dan Wang , Jizheng Wang , Rihua Zhang , Yifang Hu , Xiaoxuan Qin , Yanyan Hu , Shan Lu , Yun Liu , Wen-Song Zhang","doi":"10.1016/j.compbiolchem.2025.108436","DOIUrl":"10.1016/j.compbiolchem.2025.108436","url":null,"abstract":"<div><h3>Background</h3><div>The S100 family of calcium-binding proteins (S100s) has been closely linked to the biological processes of various cardiovascular diseases. This study aims to investigate the expression of S100s in atherosclerosis (AS) and explore their potential as diagnostic biomarkers and therapeutic targets.</div></div><div><h3>Methods</h3><div>We analyzed multiple sequencing datasets from the GEO database to compare the expression profiles of S100s in AS tissues versus normal samples. Using unsupervised clustering, AS subtypes were identified based on variations in S100-related gene expression profiles. Subsequent analyses examined immune cell infiltration and GSVA pathway enrichment, characterizing the immune landscape of the different AS subtypes. Machine learning techniques were employed to develop a diagnostic model for AS. Single-cell RNA analysis was utilized to investigate the cellular distribution of S100 hub genes in AS.</div></div><div><h3>Results</h3><div>Unsupervised clustering analysis identified two distinct AS subtypes (C1 and C2), characterized by specific S100 gene expression patterns. 
The RF-based diagnostic model exhibited the highest efficacy (AUC=0.881), and the top five genes (S100A4, S100A10, S100A11, S100A13, S100Z) were used to construct a diagnostic nomogram.</div></div><div><h3>Conclusion</h3><div>This study systematically elucidates the roles of S100s in AS, offering insights into molecular subtyping, immune characteristics, and diagnostic model construction. The findings provide valuable implications for the precise treatment and prognosis assessment of AS and pave the way for further research into related mechanisms.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108436"},"PeriodicalIF":2.6,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In silico discovery of novel compounds for FAK activation using virtual screening, AI-based prediction, and molecular dynamics","authors":"Deokhyeon Yoon , Hyunsu Lee","doi":"10.1016/j.compbiolchem.2025.108420","DOIUrl":"10.1016/j.compbiolchem.2025.108420","url":null,"abstract":"<div><div>Focal Adhesion Kinase (FAK) is a non-receptor tyrosine kinase that plays a crucial role in cell proliferation, migration, and signal transduction. FAK is overexpressed in metastatic and advanced-stage cancers, where it is considered a key kinase in cancer growth and metastasis. However, recent research has revealed that FAK activity decreases in various diseases. We aimed to identify compounds that could enhance FAK activity using structure-based virtual screening and artificial intelligence models from a vast chemical database. We began with an extensive chemical database containing over 10 million compounds and used our newly developed pipeline to screen candidate molecules. To select compounds structurally similar to ZINC40099027 (ZN27), a known FAK activator, we calculated Tanimoto similarity scores and chose compounds with a score of at least 0.8. Clustering was performed using K-means based on the molecular properties. Subsequently, we used docking simulation, deep learning, and SAScorer to evaluate and predict the protein–ligand docking affinity and physicochemical properties of the candidate compounds. State-of-the-art deep learning models were selected: GLAM predicts the blood–brain barrier permeability of FAK, and elEmBERT predicts the potential toxicity of compounds. The combined results were used to create an evaluation matrix. We selected 10 promising candidate compounds from the initial dataset of 10 million. To evaluate the stability of these top 10 candidate compounds in interaction with the FAK protein, we conducted Molecular Dynamics (MD) simulations. 
We performed a molecular dynamics simulation for a total of 50 ns and identified the top three promising candidate compounds.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108420"},"PeriodicalIF":2.6,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143724059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
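The similarity step in the abstract above keeps compounds whose Tanimoto similarity to a reference is at least 0.8. A minimal sketch of that calculation follows, treating each fingerprint as the set of its "on" bit indices. The fingerprints, compound names, and function names are hypothetical. A real pipeline would derive fingerprints from molecular structures with a cheminformatics toolkit.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity: |A ∩ B| / |A ∪ B| over on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def select_similar(reference: set[int],
                   candidates: dict[str, set[int]],
                   threshold: float = 0.8) -> list[str]:
    """Return names of candidates whose similarity to the reference meets the threshold."""
    return [name for name, fp in candidates.items()
            if tanimoto(reference, fp) >= threshold]


# Made-up fingerprints, for illustration only.
reference_fp = {1, 2, 3, 4, 5}
candidate_fps = {
    "cand_1": {1, 2, 3, 4, 5, 6},   # 5 shared / 6 total
    "cand_2": {1, 2, 3, 4, 6},      # 4 shared / 6 total
    "cand_3": {7, 8, 9},            # no overlap
}

hits = select_similar(reference_fp, candidate_fps, threshold=0.8)
print(hits)
```

Representing fingerprints as sets keeps the arithmetic explicit. Production code typically uses packed bit vectors so that millions of comparisons stay fast.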
{"title":"Lung cancer detection and classification using optimized CNN features and Squeeze-Inception-ResNeXt model","authors":"Geethu Lakshmi G, P. Nagaraj","doi":"10.1016/j.compbiolchem.2025.108437","DOIUrl":"10.1016/j.compbiolchem.2025.108437","url":null,"abstract":"<div><div>Lung cancer, with its high mortality rate, is one of the deadliest diseases globally. The alarming increase in lung cancer deaths and its widespread prevalence have led to the development of various cancer control research and early detection methods aimed at reducing mortality rates. Effective diagnostic techniques are crucial for lowering lung cancer mortality, as early detection significantly impacts treatment success. Human error can often impede accurate identification of lung nodules, which is where Computer-Aided Diagnostic (CAD) systems are utilized. These systems help radiologists by automating diagnostic processes and improving the accuracy of detecting and classifying malignancies. This paper aims to develop a deep learning approach for classifying lung diseases using chest Computed Tomography (CT) scan images. The approach starts with image pre-processing, including color space conversion, data augmentation, resizing, and normalization. Feature extraction is carried out using a Convolutional Neural Network (CNN) optimized with the Slime Mould Algorithm (SMA). For classification, a novel approach combining Squeeze-Inception V3 with ResNeXt, referred to as Squeeze-Inception-ResNeXt, is proposed. The Squeeze-Inception-ResNeXt model benefits from reduced computational cost while maintaining high performance in classifying lung diseases. This model categorizes lung diseases into Adenocarcinoma, Large Cell Carcinoma, and Squamous Cell Carcinoma. Additionally, SMA is utilized in training the Squeeze-Inception-ResNeXt model. 
Experimental results show that Squeeze-Inception-ResNeXt surpasses traditional models, with an accuracy of 97.7 %, sensitivity of 98.1 %, and specificity of 97.4 %.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108437"},"PeriodicalIF":2.6,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143734867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shanyang Ding, Dongjiang Niu, Mingxuan Li, Zhixin Zhang, Zhen Li
{"title":"Drug–drug interaction prediction based on graph contrastive learning and dual-view fusion","authors":"Shanyang Ding, Dongjiang Niu, Mingxuan Li, Zhixin Zhang, Zhen Li","doi":"10.1016/j.compbiolchem.2025.108426","DOIUrl":"10.1016/j.compbiolchem.2025.108426","url":null,"abstract":"<div><div>Drug–drug interactions (DDIs) are important in drug research and are one of the major causes of morbidity and mortality. Deep learning methods can automatically extract drug features from molecular graphs or drug-related networks, which improves the performance of DDI prediction. However, existing datasets contain noisy and incomplete data, and their volume is limited. In order to fully utilize the knowledge graph network and the molecular structure, we propose a dual-view fusion model, GDF-DDI. In one view, the knowledge graph network and drug similarity network are constructed as the global information, and two graph convolution operations are implemented on both networks to extract drug embeddings. Subsequently, layer-wise graph contrastive learning is performed to update the drug embeddings to capture richer semantic information. In the other view, self-supervised learning is used to extract more comprehensive drug embeddings. The embeddings under the two views are concatenated to cover the global and local DDI information. 
The comparative experiments on two datasets show that our model outperforms other recent and state-of-the-art baselines.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108426"},"PeriodicalIF":2.6,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143696633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incorporating time as a third dimension in transcriptomic analysis using machine learning and explainable AI","authors":"Zubaida Said Ameen , Auwalu Saleh Mubarak , Mohamed Hamad , Rifat Hamoudi , Sherlyn Jemimah , Dilber Uzun Ozsahin , Mawieh Hamad","doi":"10.1016/j.compbiolchem.2025.108432","DOIUrl":"10.1016/j.compbiolchem.2025.108432","url":null,"abstract":"<div><div>Transcriptomic data analysis entails the measurement of RNA transcript (gene expression products) abundance in a cell or a cell population at a single point in time. In other words, transcriptomics as it is currently practiced is two-dimensional (2DTA). Gene expression profiling by 2DTA has proven invaluable in furthering our understanding of numerous biological processes in health and disease. That said, shortcomings including technical variability, small sample size, differential rates of transcript decay, and the lack of linearity between transcript abundance and functionality or the formation of functional proteins limit the interpretive utility and generalizability of transcriptomic data. 2DTA utility may also be constrained by its reliance on RNA extracts obtained at a single time point. In other words, much like judging a movie by a single frame, 2DTA can only provide a snapshot of the transcriptome at the time of RNA extraction. Whether this perceived “temporality” problem is real and whether it has any bearing on transcriptomic data interpretation have yet to be addressed. To investigate this problem, 25 publicly available datasets relating to MCF-7 cells were used, in which RNA extracts obtained at 12 or 48 hours post-culture had been subjected to transcriptomic analysis. The individual datasets were downloaded and compiled into two separate datasets (MCF-7 U12hr and MCF-7 U48hr). 
To comparatively analyze the two compiled datasets, three machine learning approaches (decision trees (DT), random forests (RF), and XGBoost (Extreme Gradient Boosting)) were used as classifiers to search for genes with distinct expression patterns between the two groups. Shapley additive explanation (SHAP), an explainable AI method, was used to assess the fundamental principles of the DT, RF, and XGBoost models. Coefficient of Determination (DC), Mean Absolute Error (MAE), and Mean Squared Error (MSE) were used to evaluate the models. The results show that the two datasets exhibited very significant gene expression patterns. The XGBoost model performed better than the DT or RF models with MSE, MAE, and DC values of 0.00028, 0.00028, and 0.95778 respectively. These observations suggest that time, as a third dimension, can impact transcriptomic data interpretation and that machine learning and explainable AI are useful tools in resolving the temporality problem in transcriptomics.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108432"},"PeriodicalIF":2.6,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143685159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of potential therapeutic targets for stroke using data mining, network analysis, enrichment, and docking analysis","authors":"Mahdi Hatamipour , Hossein Saremi , Prashant Kesharwani , Amirhossein Sahebkar","doi":"10.1016/j.compbiolchem.2025.108431","DOIUrl":"10.1016/j.compbiolchem.2025.108431","url":null,"abstract":"<div><div>Stroke is a leading cause of disability and death worldwide. In this study, we identified potential therapeutic targets for stroke using a data mining, network analysis, enrichment, and docking analysis approach. We first identified 1991 genes associated with stroke from two publicly available databases: GeneCards and DisGeNET. We then constructed a protein-protein interaction (PPI) network using the STRING database, which comprised 1301 nodes and 5413 edges. We used Metascape to perform GO enrichment analysis and KEGG pathway enrichment analysis. The results of these analyses identified ten hub genes (TNF, IL6, ACTB, AKT1, IL1B, TP53, VEGFA, STAT3, CASP3, and CTNNB1) and five KEGG pathways (cancer, lipid and atherosclerosis, cytokine–cytokine receptor interaction, AGE-RAGE signaling pathway in diabetic complications, and TNF signaling pathway) that are enriched in stroke genes. We then performed molecular docking analysis to screen potential drug candidates for these targets. 
The results of this analysis identified several promising drug candidates that could be used to develop new therapeutic strategies for stroke.</div></div>","PeriodicalId":10616,"journal":{"name":"Computational Biology and Chemistry","volume":"117 ","pages":"Article 108431"},"PeriodicalIF":2.6,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143685158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}