Bioinformatics Advances: latest articles

A fast method for extracting essential and synthetic lethality genes in GEM models.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-06-06; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf127
Francisco Guil, José M García
Summary: Exploring and categorizing essential and synthetic lethality genes is crucial in developing effective and targeted therapies for various diseases. This endeavor hinges upon genetic minimal cut sets, which also find utility in metabolic engineering. Different methods have been suggested for calculating genetic minimal cut sets. Still, with the emergence of numerous new models and their increasing complexity, it has become essential to introduce new algorithms in this field. This paper presents a new algorithmic approach for computing genetic minimal cut sets, which utilizes linear programming techniques to improve temporal efficiency. The key concept of the method is to use a k-representative subset to replace the target set with a smaller, yet representative, one. We have analyzed its efficiency in terms of running times compared to gMCSPy, the most recent published research on computing genetic minimal cut sets.
Availability and implementation: Software and additional material are freely available at https://github.com/biogacop/fastMethod.
Volume 5(1), vbaf127. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12240467/pdf/
Citations: 0
Pharmacological assessment of Coffea arabica compounds as potential therapeutics for cervical cancer.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-06-05; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf132
Victor Omoboyede, Nwachukwu Christiana Okonkwo, Jimoh Olayemi Balogun, Onyekachi Victor Onyedikachi, Rita Ononiwu, Daniel Okpaise, Sarah Olanrewaju Oladejo, Christopher Busayo Olowosoke, Haruna Isiyaku Umar, Prosper Obed Chukwuemeka
Motivation: Cervical cancer remains a leading cause of gynecological mortality, with existing treatments often limited by resistance and suboptimal efficacy. While Coffea arabica is rich in phytochemicals with reported anticancer properties, their relevance to cervical cancer-specific molecular targets remains underexplored. Here, we integrated transcriptomic profiling, cheminformatics, and survival modeling to evaluate the therapeutic potential of C. arabica compounds in cervical cancer.
Results: From 158 bioactive compounds with favorable pharmacokinetic and drug-likeness properties, we predicted gene targets and intersected them with 1779 differentially expressed genes identified from bulk RNA-sequencing of 304 cervical cancer tumors and 47 normal cervical tissues. This yielded 43 C. arabica gene targets that were significantly dysregulated in cervical cancer. Pathway enrichment revealed involvement in tumorigenesis, immune modulation, and cell cycle regulation, with fold enrichment computed as the ratio of observed-to-expected gene overlap. Survival analysis identified 14 of these genes as markers of poor prognosis, with matrix metalloproteinase-7 (MMP7) emerging as an independent prognostic marker of adverse outcome. A Random-Forest-Regression model trained on 499 experimentally validated MMP7 inhibitors identified carnosol, a C. arabica compound, as a top-ranking candidate with high predicted activity. These findings nominate carnosol as a promising therapeutic lead for cervical cancer and lay the groundwork for future experimental validation.
Availability and implementation: The data supporting the findings of this study, including bulk RNA-seq gene expression data, survival, and phenotype data, are available through the TCGA database. These data can be accessed via the Xenabrowser platform (https://xenabrowser.net) using the reference identifier [TCGA Cervical Cancer (CESC)]. Corresponding healthy cervical tissue RNA-seq data are available through the Genotype-Tissue Expression (GTEx) project (https://www.gtexportal.org/home/). The codes used for differential gene expression (DGE) analysis, pathway enrichment, and survival analysis, as well as scripts for generating volcano plots (DGE analysis), Kaplan-Meier survival plots, and boxplots (gene expression), and machine learning implementations are available on GitHub (https://github.com/Ponaskillzyy/Coffea_arabica_Potential_in_Cervical_Cancer).
Volume 5(1), vbaf132. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12212767/pdf/
Citations: 0
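The Coffea arabica abstract above describes fold enrichment as the ratio of observed to expected gene overlap. A minimal sketch of that ratio follows, assuming the expected overlap is taken under random sampling from a fixed gene background. The function name, its parameters, and the example numbers are hypothetical and are not taken from the paper or its repository.

```python
def fold_enrichment(observed_overlap, n_query_genes, pathway_size, background_size):
    """Ratio of observed to expected overlap between a query gene set and a pathway.

    Assumption (not from the paper): the expected overlap is
    n_query_genes * pathway_size / background_size, i.e. the overlap
    expected if the query genes were drawn at random from the background.
    """
    if background_size <= 0:
        raise ValueError("background_size must be positive")
    expected_overlap = n_query_genes * pathway_size / background_size
    if expected_overlap == 0:
        raise ValueError("expected overlap is zero; fold enrichment is undefined")
    return observed_overlap / expected_overlap


# Hypothetical numbers: 10 of 50 query genes fall in a 100-gene pathway,
# with a background of 1000 genes, so the expected overlap is 5 genes.
example = fold_enrichment(10, 50, 100, 1000)
print(example)
```

A value above 1 means the pathway contains more query genes than random sampling would predict. The ratio alone says nothing about statistical significance, which is usually assessed separately.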
Harpy: a pipeline for processing haplotagging linked-read data.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-06-05; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf133
Pavel V Dimens, Ryan P Franckowiak, Azwad Iqbal, Jennifer K Grenier, Paul R Munn, Nina Overgaard Therkildsen
Motivation: Haplotagging is a method for linked-read sequencing, which leverages the cost-effectiveness and throughput of short-read sequencing while retaining part of the long-range haplotype information captured by long-read sequencing. Despite its utility and advantages over similar methods, existing linked-read analytical pipelines are incompatible with haplotagging data.
Results: We describe Harpy, a modular and user-friendly software pipeline for processing all stages of haplotagged linked-read data, from raw sequence data to phased genotypes and structural variant detection.
Availability and implementation: https://github.com/pdimens/harpy.
Volume 5(1), vbaf133. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198493/pdf/
Citations: 0
SynDRep: a synergistic partner prediction tool based on knowledge graph for drug repurposing.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-06-05; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf092
Karim S Shalaby, Sathvik Guru Rao, Bruce Schultz, Martin Hofmann-Apitius, Alpha Tom Kodamullil, Vinay Srinivas Bharadhwaj
Motivation: Drug repurposing is gaining interest due to its high cost-effectiveness, low risks, and improved patient outcomes. However, most drug repurposing methods depend on drug-disease-target semantic connections of a single drug rather than insights from drug combination data. In this study, we propose SynDRep, a novel drug repurposing tool based on enriching knowledge graphs (KG) with drug combination effects. It predicts the synergistic drug partner with a commonly prescribed drug for the target disease, leveraging graph embedding and machine learning (ML) techniques. This partner drug is then repurposed as a single agent for this disease by exploring pathways between them in the KG.
Results: HolE was the best-performing embedding model (with 84.58% of true predictions for all relations), and random forest emerged as the best ML model with an area under the receiver operating characteristic curve (ROC-AUC) value of 0.796. Some of our selected candidates, such as miconazole and albendazole for Alzheimer's disease, have been validated through literature, while others lack either a clear pathway or literature evidence for their use for the disease of interest. Therefore, complementing SynDRep with more specialized KGs, and additional training data, would enhance its efficacy and offer cost-effective and timely solutions for patients.
Availability and implementation: SynDRep is available as an open-source Python package at https://github.com/SynDRep/SynDRep under the Apache 2.0 License.
Volume 5(1), vbaf092. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12148216/pdf/
Citations: 0
OReO: optimizing read order for practical compression.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-06-03; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf128
Mathilde Girard, Léa Vandamme, Bastien Cazaux, Antoine Limasset
Motivation: Recent advances in high-throughput and third-generation sequencing technologies have created significant challenges in storing and managing the rapidly growing volume of read datasets. Although more than 50 specialized compression tools have been developed, employing methods such as reference-based approaches, customized generic compressors, and read reordering, many users still rely on common generic compressors (e.g. gzip, zstd, xz) for convenience, portability, and reliability, despite their low compression ratios. Here, we introduce Optimizing Read Order (OReO), a simple read-reordering framework that achieves high compression performance without requiring specialized software for decompression. By grouping overlapping reads together before applying generic compressors, OReO exploits inherent redundancies in sequencing data and achieves compression ratios on par with state-of-the-art tools. Moreover, because it relies only on standard decompressors, OReO avoids the need for dedicated installations and maintenance, removing a key barrier to practical adoption.
Results: We evaluated OReO on both Oxford Nanopore Technologies (ONT) and HiFi genomic and metagenomic datasets of varying sizes and complexities. Our results demonstrate that OReO provides substantial compression gains with comparable resource usage and outperforms dedicated methods in decompression speed. We propose that future compression strategies should focus on reordering as a means to let generic compression tools fully exploit data redundancy, offering an efficient, sustainable, and user-friendly solution to the growing challenges of sequencing data storage.
Availability and implementation: The OReO code is open source and available at github.com/girunivlille/oreo.
Volume 5(1), vbaf128. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12185860/pdf/
Citations: 0
Predicting gene expression using millions of yeast promoters reveals cis-regulatory logic.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-06-02; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf130
Tirtharaj Dash, Susanne Bornelöv
Motivation: Gene regulation involves complex interactions between transcription factors. While early attempts to predict gene expression were trained using naturally occurring promoters, gigantic parallel reporter assays have vastly expanded potential training data. Despite this, it is still unclear how to best use deep learning to study gene regulation. Here, we investigate the association between promoters and expression using Camformer, a residual convolutional neural network that ranked fourth in the Random Promoter DREAM Challenge 2022. We present the original model trained on 6.7 million sequences and investigate 270 alternative models to find determinants of model performance. Finally, we use explainable AI to uncover regulatory signals.
Results: Camformer accurately decodes the association between promoters and gene expression ($r^2 = 0.914 \pm 0.003$, $\rho = 0.962 \pm 0.002$) and provides a substantial improvement over previous state of the art. Using Grad-CAM and in silico mutagenesis, we demonstrate that our model learns both individual motifs and their hierarchy. For example, while an IME1 motif on its own increases gene expression, a co-occurring UME6 motif instead strongly reduces gene expression. Thus, deep learning models such as Camformer can provide detailed insights into cis-regulatory logic.
Availability and implementation: Data and code are available at: https://github.com/Bornelov-lab/Camformer.
Volume 5(1), vbaf130. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12188188/pdf/
Citations: 0
Causeway: a pipeline for genome-wide effector gene screening with Mendelian Randomization and colocalization.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-05-29; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf110
Julia A de Amorim, João Vitor F Cavalcante, Diego Marques-Coelho, Rodrigo J S Dalmolin, Vasiliki Lagou
Summary: The integration of quantitative trait loci and disease genome-wide association studies for pinpointing candidate causal genes is a computationally demanding task accompanied by pitfalls related to the methods used. To address these issues, we introduce Causeway, a novel Nextflow pipeline for performing summary statistics-based two sample Mendelian Randomization for causal gene prioritization. The pipeline executes sensitivity and colocalization analyses for interrogation of findings providing robust results. The tool is designed to run tasks in a computationally efficient way even in low-resource environments, such as a personal computer. Furthermore, it can scale to web servers and high-performance computing clusters.
Availability and implementation: The source code of Causeway is available on GitHub at https://github.com/juliaapolonio/Causeway, while the documentation and instructions to run the vignette are at https://juliaapolonio.github.io/Causeway/.
Volume 5(1), vbaf110. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12161984/pdf/
Citations: 0
Using pseudotime derivative on single-cell RNA sequencing data to identify genes undergoing cell cycle regulation.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-05-29; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf123
Yohan Lefol, Geir Amund Svan Hasle, Siv Anita Hegre, Helle Samdal, Pål Sætrom
Motivation: The cell cycle is a critical part of cellular life, one that has long been studied, both directly and through its regulatory components. Commonly, cell cycle synchronization or selection experiments are performed in order to study the cell cycle, thus chemically modifying the cells, or selecting them for specific phases. We seek to develop a means to study the cell cycle through the use of single cell RNA sequencing, effectively circumventing the need for such experiments.
Results: We utilize a well-established pseudotime method, along with the predicted and real expression of genes to calculate the velocity of individual genes. We then utilize statistics and expected biological behaviour to identify genes with significant shifts in velocity within the pseudotime. Additionally, we show the ability to observe gene regulatory behaviour such as mRNA splicing and degradation rates. As much cell line-based research utilizes multiple replicates, we implement a merger method for technical replicates to adjust for technical variations, creating a more robust analysis. In summary, our study develops a robust approach to map the velocities of individual, biologically, and statistically significant genes throughout the cell cycle's phases within a cell line experiment.
Availability and implementation: Data and code are available at: https://github.com/Ylefol/CC_vel.
Volume 5(1), vbaf123. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12255884/pdf/
Citations: 0
ReSort enhances reference-based cell type deconvolution for spatial transcriptomics through regional information integration.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-05-27; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf091
Linhua Wang, Ling Wu, Guantong Qi, Chaozhong Liu, Wanli Wang, Xiang H-F Zhang, Zhandong Liu
Motivation: Spatial transcriptomics (ST) captures positional gene expression within tissues but lacks single-cell resolution. Reference-based cell type deconvolution methods were developed to understand cell type distributions for ST. However, batch/platform discrepancies between references and ST impact their accuracy.
Results: We present Region-based Cell Sorting (ReSort), which utilizes ST's region-level data to lessen reliance on reference data and alleviate these technical issues. In simulation studies, ReSort enhances reference-based deconvolution methods. Applying ReSort to a mouse breast cancer model highlights macrophages M0 and M2 enrichment in the epithelial clone, revealing insights into epithelial-mesenchymal transition and immune infiltration.
Availability and implementation: Source codes for ReSort are publicly available at https://github.com/LiuzLab/RESORT, implemented in Python.
Volume 5(1), vbaf091. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12161990/pdf/
Citations: 0
Predictive machine learning model for 30-day hospital readmissions in a tertiary healthcare setting.
IF 2.4
Bioinformatics Advances. Pub Date: 2025-05-24; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbaf121
Diego Halac, Cecilia Cocucci, Sebastian Camerlingo
Motivation: Hospital readmissions represent a major challenge for healthcare systems due to their impact on patient outcomes and associated costs. As many readmissions are considered preventable, predictive modeling offers a valuable tool for early identification and intervention. This study aimed to develop and validate a predictive model for 30-day readmissions in a 200-bed community hospital in Argentina. A retrospective analysis was conducted on 3388 adult admissions. The primary endpoint was readmission within 30 days of discharge. Predictor variables included demographic and clinical factors such as age, length of stay, hypertension, diabetes, heart failure, coronary artery disease, stroke, cancer, dementia, chronic kidney disease, chronic obstructive pulmonary disease, and bedridden status. Three models, Logistic Regression (LR), Random Forest (RF), and LightGBM (LGBM), were developed, with hyperparameter tuning via Bayesian optimization. Model performance was assessed using calibration, discrimination (C-statistics), and decision curve analysis. Internal validation was performed using 250 bootstrap resamples.
Results: The readmission rate was 11% (n = 394). RF outperformed LR and LGBM in discrimination and clinical utility within predictive probability thresholds of 0.05-0.25. Optimism-corrected C-statistics were 0.60 (LR, LGBM) and 0.64 (RF); calibration slopes were 0.75 (LR), 1.13 (RF), and 1.76 (LGBM). Machine learning models, particularly RF, can improve readmission risk prediction and inform targeted healthcare interventions.
Availability and implementation: The dataset and code used to develop and validate the predictive models are available from the corresponding author upon reasonable request. The implementation was conducted in R using the mlr3verse, pminternal, rms, dcurves, data.table, tidyverse, ranger and lightgbm packages, with Bayesian hyperparameter optimization via mlr3mbo.
Volume 5(1), vbaf121. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12158157/pdf/
Citations: 0