Biodata Mining. Pub Date: 2026-01-24. DOI: 10.1186/s13040-025-00515-z
Zhihui Xiong, Yi Yuan, Zhouhui Yun, Lijie Li, Yunmeng Chen
{"title":"Construction of an interpretable machine learning model for predicting gestational diabetes mellitus based on 45 dietary nutrients.","authors":"Zhihui Xiong, Yi Yuan, Zhouhui Yun, Lijie Li, Yunmeng Chen","doi":"10.1186/s13040-025-00515-z","DOIUrl":"10.1186/s13040-025-00515-z","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":"12"},"PeriodicalIF":6.1,"publicationDate":"2026-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12874857/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146044453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-01-19. DOI: 10.1186/s13040-025-00517-x
Muhammad Attique, Yaser Daanial Khan, Fahad Alturise, Tamim Alkhalifah
{"title":"Deep learning based prediction of RNA 5hmC modifications using composite feature representations and comparative benchmarking with transformer models.","authors":"Muhammad Attique, Yaser Daanial Khan, Fahad Alturise, Tamim Alkhalifah","doi":"10.1186/s13040-025-00517-x","DOIUrl":"10.1186/s13040-025-00517-x","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":"15"},"PeriodicalIF":6.1,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12874776/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146004492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-01-10. DOI: 10.1186/s13040-025-00512-2
Debora Garza-Hernandez, Emmanuel Martinez-Ledesma, Victor Trevino
{"title":"Genotype subtyping approach to identify unnoticed variants in diseases from GWAS data.","authors":"Debora Garza-Hernandez, Emmanuel Martinez-Ledesma, Victor Trevino","doi":"10.1186/s13040-025-00512-2","DOIUrl":"10.1186/s13040-025-00512-2","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":"8"},"PeriodicalIF":6.1,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875041/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145949375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2026-01-08. DOI: 10.1186/s13040-025-00516-y
Geletaw Sahle Tegenaw, Hailin Song, Tomas Ward
{"title":"Generalization or mirage? Data leakage and reported performance in neonatal EEG seizure detection models: a systematic review.","authors":"Geletaw Sahle Tegenaw, Hailin Song, Tomas Ward","doi":"10.1186/s13040-025-00516-y","DOIUrl":"https://doi.org/10.1186/s13040-025-00516-y","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":""},"PeriodicalIF":6.1,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145935756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2025-12-22. DOI: 10.1186/s13040-025-00506-0
Isibor Kennedy Ihianle, Wathsala Samarasekara, Keeley Brookes, Pedro Machado
{"title":"Cross-cohort genetic risk prediction for Alzheimer's disease: a transfer learning approach using GWAS and deep learning models.","authors":"Isibor Kennedy Ihianle, Wathsala Samarasekara, Keeley Brookes, Pedro Machado","doi":"10.1186/s13040-025-00506-0","DOIUrl":"10.1186/s13040-025-00506-0","url":null,"abstract":"","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":"89"},"PeriodicalIF":6.1,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12752400/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145811982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2025-12-22. DOI: 10.1186/s13040-025-00498-x
Yujie Huo, Weng Howe Chan, Ahmad Najmi Bin Amerhaider Nuar, Hongyu Gao
{"title":"SBT-Net: a tri-cue guided multimodal fusion framework for depression recognition.","authors":"Yujie Huo, Weng Howe Chan, Ahmad Najmi Bin Amerhaider Nuar, Hongyu Gao","doi":"10.1186/s13040-025-00498-x","DOIUrl":"10.1186/s13040-025-00498-x","url":null,"abstract":"<p>Early detection of depression is vital for public health, yet current multimodal methods often struggle with challenges such as modality incompleteness, semantic inconsistency, and emotional temporal fluctuation. To address these issues, this paper proposes SBT-Net, a novel Semantic-Bias-Trend guided framework for robust depression detection from audio and text data. The model integrates three innovative modules: a semantically guided cross-modal gating (SGCMG) mechanism that dynamically filters effective modality features based on global semantic cues, a bias-guided tensor product attention (BG-TPA) mechanism that enhances fine-grained fusion and alignment between modalities, and an emotion trend modeling (ETM) module that captures the temporal evolution of depressive emotional states. We evaluate SBT-Net using two widely adopted benchmark datasets: DAIC-WOZ, which contains 189 interview sessions, and EATD-Corpus, comprising 162 conversational samples. Experimental results show that SBT-Net achieves strong performance on multiple metrics, including 93.0% accuracy, 0.93 F1 score, and 0.92 recall, all of which surpass the competitive baselines. Ablation studies further validate the individual and synergistic contributions of each proposed module. These findings highlight the potential of integrating semantic guidance, bias-aware fusion, and emotional trend modeling to advance multimodal depression detection solutions. The source code can be found at https://github.com/ghy-yhg/SBT-Net .</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":"18 1","pages":"86"},"PeriodicalIF":6.1,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12723886/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145811969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biodata Mining. Pub Date: 2025-12-22. DOI: 10.1186/s13040-025-00500-6
Roberta Coletti, J Orestes Cerdeira, Marcos Raydan, Marta B Lopes
{"title":"An unsupervised tool for biomarker discovery and cancer subtyping applied to glioblastoma.","authors":"Roberta Coletti, J Orestes Cerdeira, Marcos Raydan, Marta B Lopes","doi":"10.1186/s13040-025-00500-6","DOIUrl":"10.1186/s13040-025-00500-6","url":null,"abstract":"<p><strong>Background: </strong>High-dimensional omics data often contain more variables than observations, which can lead to overfitting and negatively impact the results of classical data analysis methods. To address the issue, supervised variable selection methods are often used, incorporating penalty terms into the model. While effective for selecting task-specific variables, this approach may not preserve the overall dataset structure for multiple downstream analyses. This study aims to evaluate unsupervised variable selection approaches and introduce a novel tool that improves data interpretability while maintaining biological information.</p><p><strong>Results: </strong>We assessed multiple unsupervised variable selection techniques to identify a representative subset of the original dataset. Based on this evaluation, we developed TRIM-IT, a computational tool that integrates unsupervised variable selection, clustering, survival analysis, and differential gene expression analysis. TRIM-IT was applied to glioblastoma transcriptomics data, uncovering three distinct patient clusters. These clusters correlated with tumor histology, exhibited significantly different survival outcomes, and revealed molecular profiles that suggest potential biomarker candidates.</p><p><strong>Conclusion: </strong>TRIM-IT provides a novel approach for analyzing high-dimensional omics data while preserving key biological insights. Its ability to identify meaningful patient subgroups and molecular signatures highlights its applicability across various biomedical research contexts. The tool is implemented in R and the code is publicly available for reproduction and adaptation to other studies.</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":"18 1","pages":"85"},"PeriodicalIF":6.1,"publicationDate":"2025-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12720448/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145811993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning to predict emergency department revisit using static and dynamic features (Deep Revisit): development and validation study.","authors":"Su-Yin Hsu, Jhe-Yi Jhu, Jun-Wan Gao, Chien-Hua Huang, Chu-Lin Tsai, Li-Chen Fu","doi":"10.1186/s13040-025-00509-x","DOIUrl":"10.1186/s13040-025-00509-x","url":null,"abstract":"<p><strong>Background: </strong>Emergency Department (ED) revisits represent a critical issue in emergency medicine. Identifying high-risk revisit cases (revisits with intensive care unit admissions, cardiac arrest, or requiring emergency surgery) is particularly important. While prior studies have explored machine learning models for ED revisit prediction, few deep learning approaches exist, and dynamic features remain underutilized.</p><p><strong>Methods: </strong>We used data from National Taiwan University Hospital (NTUH), incorporating both static (e.g., age, sex, triage) and dynamic (vital signs) features. A preprocessing strategy was developed to handle temporal irregularities. We proposed a hybrid deep learning model combining Temporal Convolutional Network (TCN) and feature tokenizer (FT)-Transformer to integrate static and short-term dynamic information.</p><p><strong>Results: </strong>We evaluated our model on NTUH 2016-2019 data, achieving the area under the receiver operating characteristic curve (AUROC) of 0.8453 and the area under precision recall curve (AUPRC) of 0.0935 for high-risk revisits (base rate = 0.01), and AUROC of 0.7250 and AUPRC of 0.2005 for general revisits (base rate = 0.042). The model maintained robust performance when validated on 2020-2022 data. Compared to the static-only logistic regression baseline, our model improved AUPRC from 0.0288 to 0.0935 and precision from 0.0281 to 0.0428.</p><p><strong>Conclusion: </strong>Our model significantly outperformed the static-only baseline. It demonstrates the effectiveness of multimodal clinical data fusion in improving ED revisit prediction and supporting clinical decision-making.</p>","PeriodicalId":48947,"journal":{"name":"Biodata Mining","volume":" ","pages":"88"},"PeriodicalIF":6.1,"publicationDate":"2025-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750547/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145800645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}