Biodata Mining. Pub Date: 2026-04-03. DOI: 10.1186/s13040-026-00548-y
Michael Zietz, Undina Gisladottir, Kathleen LaRow Brown, Nicholas P Tatonetti
{"title":"WebGWAS: a web server for instant GWAS on arbitrary phenotypes.","authors":"Michael Zietz, Undina Gisladottir, Kathleen LaRow Brown, Nicholas P Tatonetti","doi":"10.1186/s13040-026-00548-y","DOIUrl":"10.1186/s13040-026-00548-y","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147610349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-04-02. DOI: 10.1186/s13040-026-00532-6
Giuseppe Defazio, Erika Lorusso, Mariangela De Robertis, Tommaso Mello, Andrea Galli, Graziano Pesole, Bruno Fosso
{"title":"Machine learning-based assessment of the healthy human gut mycobiota landscape using ITS1 DNA metabarcoding data.","authors":"Giuseppe Defazio, Erika Lorusso, Mariangela De Robertis, Tommaso Mello, Andrea Galli, Graziano Pesole, Bruno Fosso","doi":"10.1186/s13040-026-00532-6","DOIUrl":"https://doi.org/10.1186/s13040-026-00532-6","url":null,"abstract":"<p><p>The human gut microbiome plays a critical role in maintaining host health and homeostasis, and current literature suggests a bidirectional relationship between microbiome ecology and host well-being. DNA metabarcoding has emerged as a powerful tool for investigating microbiome imbalances (i.e., dysbiosis). While the prokaryotic microbiome has been extensively studied, the fungal counterpart - or mycobiome - remains largely unexplored, despite its recognized role from the perinatal stage onward. Here, we present a comprehensive survey based on DNA metabarcoding analysis of approximately 1,500 publicly available ITS1 samples. This survey integrates conventional statistical approaches with Machine Learning (ML) methods coupled with explainable Artificial Intelligence (XAI). ML models successfully predicted host health status with accuracies exceeding 80%, and fungal genera such as Eurotium, Aureobasidium, Candida, and Cutaneotrichosporon emerged as key classification features. 
This study introduces a cutting-edge multiview analytical framework applied to publicly available mycobiome data, highlighting the potential of fungal community profiling as a non-invasive tool to support health diagnostics.</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147610394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-03-31. DOI: 10.1186/s13040-026-00552-2
Viktor Vedelek, Balázs Vedelek, Rita Sinka
{"title":"Machine learning analysis of Drosophila testis transcriptomic data reveals potential regulatory sequences.","authors":"Viktor Vedelek, Balázs Vedelek, Rita Sinka","doi":"10.1186/s13040-026-00552-2","DOIUrl":"https://doi.org/10.1186/s13040-026-00552-2","url":null,"abstract":"<p><strong>Background: </strong>The volume of accessible transcriptomic data is increasing rapidly due to advances in affordable sequencing technologies. The quality and resolution of gene expression maps have also improved, driven by the advent of single-cell technologies.</p><p><strong>Results: </strong>Here we present a method that integrates transcriptomic data from five different sources, including segmented, single-cyst, and single-cell transcriptomic data from Drosophila testis, to investigate the expression and accumulation of transcripts in this tissue. The analysis showed that testis-specific genes have a characteristic profile in the testis, which is predictable using a supervised machine learning algorithm (XGBoost). Moreover, dimension reduction with an unsupervised machine learning algorithm (t-SNE) followed by clustering (DBSCAN) of the genes revealed potential regulatory motifs shared by genes in the same group. 
This approach links the robustness of tissue expression data with the resolution of the single-cell techniques, while masking the weakness of each technique.</p><p><strong>Conclusions: </strong>The presented approach can be used to find similarly expressed genes and shared regulatory elements or new cell-specific transcripts that could have potential annotation benefits in further research.</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147595678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-03-30. DOI: 10.1186/s13040-026-00539-z
Ivan Ferrari, Elisa Arsuffi, Nicola Manfrini, Stefano Biffo
{"title":"CancerHubs Data Explorer: a web application for investigating mutation-enriched protein interaction hubs in human cancers.","authors":"Ivan Ferrari, Elisa Arsuffi, Nicola Manfrini, Stefano Biffo","doi":"10.1186/s13040-026-00539-z","DOIUrl":"10.1186/s13040-026-00539-z","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13051495/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147582169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-03-30. DOI: 10.1186/s13040-026-00550-4
Sisi Shao, Pedro Henrique Ribeiro, Alena Orlenko, Katie M Cardone, Christina M Ramirez, Li Shen, Marylyn D Ritchie, Jason H Moore
{"title":"A biology-based quality-diversity algorithm for drug repurposing in Alzheimer's disease using automated machine learning.","authors":"Sisi Shao, Pedro Henrique Ribeiro, Alena Orlenko, Katie M Cardone, Christina M Ramirez, Li Shen, Marylyn D Ritchie, Jason H Moore","doi":"10.1186/s13040-026-00550-4","DOIUrl":"https://doi.org/10.1186/s13040-026-00550-4","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147582189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-03-28. DOI: 10.1186/s13040-026-00545-1
Jeff Joseph, Christopher Niemczak, Jonathan Lichtenstein, Albert Magohe, Samantha Leigh, Linda Zhang, Enica Massawe, Jiang Gui, Jay C Buckey
{"title":"Harnessing machine learning with auditory tests and demographic factors to forecast children's reading abilities in children living with and without HIV.","authors":"Jeff Joseph, Christopher Niemczak, Jonathan Lichtenstein, Albert Magohe, Samantha Leigh, Linda Zhang, Enica Massawe, Jiang Gui, Jay C Buckey","doi":"10.1186/s13040-026-00545-1","DOIUrl":"10.1186/s13040-026-00545-1","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13151343/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147576157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-03-27. DOI: 10.1186/s13040-026-00549-x
Davide Chicco, Srinjoy Dora, Luca Oneto
{"title":"DBSCAN applied to EHRs data from patients with glioblastoma clusters patients based on cytosolic Hsp70 protein, sex, and brain subventricular zone.","authors":"Davide Chicco, Srinjoy Dora, Luca Oneto","doi":"10.1186/s13040-026-00549-x","DOIUrl":"10.1186/s13040-026-00549-x","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13147812/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147522449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-03-24. DOI: 10.1186/s13040-026-00531-7
Jungmin You, Jeongmin Kim, Jeongeun Choi, Bon-Nyeo Koo, Hyangkyu Lee
{"title":"Multi-output LSTM-based prediction of postoperative delirium: integrating baseline and perioperative data for enhanced risk stratification in older spine surgery patients.","authors":"Jungmin You, Jeongmin Kim, Jeongeun Choi, Bon-Nyeo Koo, Hyangkyu Lee","doi":"10.1186/s13040-026-00531-7","DOIUrl":"10.1186/s13040-026-00531-7","url":null,"abstract":"<p><strong>Introduction: </strong>Postoperative delirium (POD) adversely affects clinical outcomes among older adults undergoing spine surgery. However, existing predictive models often neglect the multidimensional nature of delirium, including its clinical subtype, duration, severity, and timing. This study developed a multi-output Long Short-Term Memory (LSTM) neural network that integrates preoperative baseline characteristics and intraoperative acute stressors to predict multiple clinical dimensions of POD in elderly patients undergoing spinal surgery.</p><p><strong>Methods: </strong>This prospective observational study included 536 patients aged 70 or older who underwent elective spine surgery between November 2019 and May 2023. Comprehensive assessments were conducted during both the preoperative and intraoperative phases. The multi-output LSTM model incorporated preoperative baseline variables (demographics, frailty scores, cognitive function, medication count, and laboratory parameters) and intraoperative data (surgical invasiveness, duration of surgery and anesthesia, intraoperative fluid management, and immediate postoperative medication use). Outcomes comprised delirium occurrence, subtype, duration, severity, and onset timing. Model performance was evaluated via accuracy, precision, recall, F1-score, and ROC curve analyses. 
SHapley Additive exPlanations (SHAP) analysis enhanced clinical interpretability.</p><p><strong>Results: </strong>Using solely preoperative baseline data, the model demonstrated strong predictive performance with an overall AUC of 0.76, particularly for delirium occurrence (AUC = 0.68), duration (AUC = 0.80), and severity (AUC = 0.79). Incorporating intraoperative data substantially enhanced model performance, increasing the overall AUC to 0.81, notably improving predictions for delirium subtype (AUC up to 0.84), duration (AUC = 0.81), and onset timing (AUC up to 0.87). SHAP analysis consistently identified frailty, polypharmacy, cognitive impairment, nutritional deficiencies, and acute perioperative factors, such as surgical invasiveness and pain management, as pivotal predictors across delirium dimensions.</p><p><strong>Conclusion: </strong>The proposed multi-output LSTM model predicted multiple clinical dimensions of postoperative delirium, highlighting baseline health status as a primary determinant. Strategic integration of comprehensive baseline assessments with acute perioperative data substantially enhances predictive accuracy, informing personalized delirium prevention and management strategies for improved perioperative outcomes in older spine surgery patients.</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13137560/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147516116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}