Changlong Wang , You Zhou , Yuanshu Li , Wei Pang , Liupu Wang , Wei Du , Hui Yang , Ying Jin
{"title":"ICPPNet: A semantic segmentation network model based on inter-class positional prior for scoliosis reconstruction in ultrasound images","authors":"Changlong Wang , You Zhou , Yuanshu Li , Wei Pang , Liupu Wang , Wei Du , Hui Yang , Ying Jin","doi":"10.1016/j.jbi.2025.104827","DOIUrl":"10.1016/j.jbi.2025.104827","url":null,"abstract":"<div><h3>Objective:</h3><div>Given the radiation hazard of X-rays, safer, more convenient, and more cost-effective ultrasound methods are gradually becoming new diagnostic approaches for scoliosis. In ultrasound images of the spine, it is challenging to accurately identify spine regions because the target areas are relatively small and the images contain substantial interfering information. Therefore, we developed a novel neural network that incorporates prior knowledge to precisely segment spine regions in ultrasound images.</div></div><div><h3>Materials and methods:</h3><div>We constructed a dataset of ultrasound images of spine regions for semantic segmentation. The dataset contains 3136 images of 30 patients with scoliosis. We propose a network model (ICPPNet), which fully utilizes inter-class positional prior knowledge by combining an inter-class positional probability heatmap, to achieve accurate segmentation of target areas.</div></div><div><h3>Results:</h3><div>ICPPNet achieved an average Dice similarity coefficient of 70.83% and an average 95% Hausdorff distance of 11.28 mm on the dataset, demonstrating its excellent performance. The average error between the Cobb angle measured by our method and the Cobb angle measured from X-ray images is 1.41 degrees, and the coefficient of determination is 0.9879, indicating a strong correlation.</div></div><div><h3>Discussion and conclusion:</h3><div>ICPPNet provides a new solution for the medical image segmentation task with positional prior knowledge between target classes. 
ICPPNet also strongly supports the subsequent reconstruction of spine models using ultrasound images.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"166 ","pages":"Article 104827"},"PeriodicalIF":4.0,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143874964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abel Corrêa Dias, Viviane Pereira Moreira, João Luiz Dihl Comba
{"title":"RoBIn: A Transformer-based model for risk of bias inference with machine reading comprehension","authors":"Abel Corrêa Dias, Viviane Pereira Moreira, João Luiz Dihl Comba","doi":"10.1016/j.jbi.2025.104819","DOIUrl":"10.1016/j.jbi.2025.104819","url":null,"abstract":"<div><h3>Objective:</h3><div>Scientific publications are essential for uncovering insights, testing new drugs, and informing healthcare policies. Evaluating the quality of these publications often involves assessing their Risk of Bias (RoB), a task traditionally performed by human reviewers. The goal of this work is to create a dataset and develop models that allow automated RoB assessment in clinical trials.</div></div><div><h3>Methods:</h3><div>We use data from the Cochrane Database of Systematic Reviews (CDSR) as ground truth to label open-access clinical trial publications from PubMed. This process enabled us to develop training and test datasets specifically for machine reading comprehension and RoB inference. Additionally, we created extractive (RoBIn<sup>Ext</sup>) and generative (RoBIn<sup>Gen</sup>) Transformer-based approaches to extract relevant evidence and classify the RoB effectively.</div></div><div><h3>Results:</h3><div>RoBIn was evaluated across various settings and benchmarked against state-of-the-art methods, including large language models (LLMs). In most cases, the best-performing RoBIn variant surpasses traditional machine learning and LLM-based approaches, achieving an AUROC of 0.83.</div></div><div><h3>Conclusion:</h3><div>This work addresses RoB assessment in clinical trials by introducing RoBIn, two Transformer-based models for RoB inference and evidence retrieval, which outperform traditional models and LLMs, demonstrating their potential to improve efficiency and scalability in clinical research evaluation. 
We also introduce a public dataset that is automatically annotated and can be used to enable future research to enhance automated RoB assessment.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"166 ","pages":"Article 104819"},"PeriodicalIF":4.0,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143843115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fangwen Zhou , Rick Parrish , Muhammad Afzal , Ashirbani Saha , R. Brian Haynes , Alfonso Iorio , Cynthia Lokker
{"title":"Benchmarking domain-specific pretrained language models to identify the best model for methodological rigor in clinical studies","authors":"Fangwen Zhou , Rick Parrish , Muhammad Afzal , Ashirbani Saha , R. Brian Haynes , Alfonso Iorio , Cynthia Lokker","doi":"10.1016/j.jbi.2025.104825","DOIUrl":"10.1016/j.jbi.2025.104825","url":null,"abstract":"<div><h3>Objective</h3><div>Encoder-only transformer-based language models have shown promise in automating critical appraisal of clinical literature. However, a comprehensive evaluation of the models for classifying the methodological rigor of randomized controlled trials is necessary to identify the more robust ones. This study benchmarks several state-of-the-art transformer-based language models using a diverse set of performance metrics.</div></div><div><h3>Methods</h3><div>Seven transformer-based language models were fine-tuned on the title and abstract of 42,575 articles from 2003 to 2023 in McMaster University’s Premium LiteratUre Service database under different configurations. The studies reported in the articles addressed questions related to treatment, prevention, or quality improvement for which randomized controlled trials are the gold standard with defined criteria for rigorous methods. Models were evaluated on the validation set using 12 schemes and metrics, including optimization for cross-entropy loss, Brier score, AUROC, average precision, sensitivity, specificity, and accuracy, among others. Threshold tuning was performed to optimize threshold-dependent metrics. Models that achieved the best performance in one or more schemes on the validation set were further tested in hold-out and external datasets.</div></div><div><h3>Results</h3><div>A total of 210 models were fine-tuned. Six models achieved top performance in one or more evaluation schemes. Three BioLinkBERT models outperformed others on 8 of the 12 schemes. BioBERT, BiomedBERT, and SciBERT were best on 1, 1 and 2 schemes, respectively. 
While model performance remained robust on the hold-out test set, it declined in external datasets. Class weight adjustments improved performance in most instances.</div></div><div><h3>Conclusion</h3><div>BioLinkBERT generally outperformed the other models. Using comprehensive evaluation metrics and threshold tuning optimizes model selection for real-world applications. Future work should assess generalizability to other datasets, explore alternate imbalance strategies, and examine training on full-text articles.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"166 ","pages":"Article 104825"},"PeriodicalIF":4.0,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143843116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Silvia Cascianelli, Iva Milojkovic, Marco Masseroli
{"title":"A novel machine learning-based workflow to capture intra-patient heterogeneity through transcriptional multi-label characterization and clinically relevant classification","authors":"Silvia Cascianelli, Iva Milojkovic, Marco Masseroli","doi":"10.1016/j.jbi.2025.104817","DOIUrl":"10.1016/j.jbi.2025.104817","url":null,"abstract":"<div><h3>Objectives:</h3><div>Patient classification into specific molecular subtypes is paramount in biomedical research and clinical practice to face complex, heterogeneous diseases. Existing methods, especially for gene expression-based cancer subtyping, often simplify patient molecular portraits, neglecting the potential co-occurrence of traits from multiple subtypes. Yet, recognizing intra-sample heterogeneity is essential for more precise patient characterization and improved personalized treatments.</div></div><div><h3>Methods:</h3><div>We developed a novel computational workflow, named MULTI-STAR, which addresses current limitations and provides tailored solutions for reliable multi-label patient subtyping. MULTI-STAR uses state-of-the-art subtyping methods to obtain promising machine learning-based multi-label classifiers, leveraging gene expression profiles. It modifies standard single-label similarity-based techniques to obtain multi-label patient characterizations. Then, it employs these characterizations to train single-sample predictors using different multi-label strategies and find the best-performing classifiers.</div></div><div><h3>Results:</h3><div>MULTI-STAR classifiers offer advanced multi-label recognition of all the subtypes contributing to the molecular and clinical traits of a patient, also distinguishing the primary from the additional relevant secondary subtype(s). 
The efficacy was demonstrated by developing multi-label solutions for breast and colorectal cancer subtyping that outperform existing methods in terms of prognostic value, primarily for overall survival predictions, and ability to work on a single sample at a time, as required in clinical practice.</div></div><div><h3>Conclusions:</h3><div>This work emphasizes the importance of moving to multi-label subtyping to capture all the molecular traits of individual patients, considering also previously overlooked secondary assignments and paving the way for improved clinical decision-making processes in diverse heterogeneous disease contexts. Indeed, MULTI-STAR's novel, reproducible, and generalizable approach provides comprehensive representations of patient inner heterogeneity and clinically relevant insights, contributing to precision medicine and personalized treatments.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"166 ","pages":"Article 104817"},"PeriodicalIF":4.0,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143816805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Javier Petri , Pilar Barcena Barbeira , Martina Pesce , Verónica Xhardez , Rodrigo Laje , Viviana Cotik
{"title":"Low-cost algorithms for clinical notes phenotype classification to enhance epidemiological surveillance: A case study","authors":"Javier Petri , Pilar Barcena Barbeira , Martina Pesce , Verónica Xhardez , Rodrigo Laje , Viviana Cotik","doi":"10.1016/j.jbi.2025.104795","DOIUrl":"10.1016/j.jbi.2025.104795","url":null,"abstract":"<div><h3>Objective:</h3><div>Our study aims to enhance epidemic intelligence through event-based surveillance in an emerging pandemic context. We classified electronic health records (EHRs) from La Rioja, Argentina, focusing on predicting COVID-19-related categories in a scenario with limited disease knowledge, evolving symptoms, non-standardized coding practices, and restricted training data due to privacy issues.</div></div><div><h3>Methods:</h3><div>Using natural language processing techniques, we developed rapid, cost-effective methods suitable for implementation with limited resources. We annotated a corpus for training and testing classification models, ranging from simple logistic regression to more complex fine-tuned transformers.</div></div><div><h3>Results:</h3><div>The transformer-based, Spanish-adapted models BETO Clínico and RoBERTa Clínico, further pre-trained with an unannotated portion of our corpus, were the best-performing models (F1=88.13% and 87.01%). A simple logistic regression (LR) model ranked third (F1=85.09%), outperforming more complex models like XGBoost and BiLSTM. Data classified as COVID-confirmed using LR and BETO Clínico exhibit stronger time-series Pearson correlation with official COVID-19 case counts from the National Health Surveillance System (SNVS 2.0) in La Rioja province compared to the correlations observed between the International Classification of Diseases (ICD-10) codes and the SNVS 2.0 data (0.840, 0.873, and 0.663, p-values <span><math><mrow><mo>≤</mo><mn>3</mn><mo>×</mo><mn>1</mn><msup><mrow><mn>0</mn></mrow><mrow><mo>−</mo><mn>7</mn></mrow></msup></mrow></math></span>). 
Both models have a good Pearson correlation with ICD-10 codes assigned to the clinical notes for confirmed (0.940 and 0.902) and for suspected cases (0.960 and 0.954), p-values <span><math><mrow><mo>≤</mo><mn>1</mn><mo>.</mo><mn>7</mn><mo>×</mo><mn>1</mn><msup><mrow><mn>0</mn></mrow><mrow><mo>−</mo><mn>18</mn></mrow></msup></mrow></math></span>.</div></div><div><h3>Conclusion:</h3><div>This study shows that simple, resource-efficient methods can achieve results comparable to complex approaches. BETO Clínico and LR strongly correlate with official data, revealing uncoded confirmed cases at the pandemic’s onset. Our results suggest that annotating a smaller set of EHRs and training a simple model may be more cost-effective than manual coding. This points to potentially efficient strategies in public health emergencies, particularly in resource-limited settings, and provides valuable insights for future epidemic response efforts.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"166 ","pages":"Article 104795"},"PeriodicalIF":4.0,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143833466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francisco J. Lara-Abelenda , David Chushig-Muzo , Pablo Peiro-Corbacho , Vanesa Gómez-Martínez , Ana M. Wägner , Conceição Granja , Cristina Soguero-Ruiz
{"title":"Transfer learning for a tabular-to-image approach: A case study for cardiovascular disease prediction","authors":"Francisco J. Lara-Abelenda , David Chushig-Muzo , Pablo Peiro-Corbacho , Vanesa Gómez-Martínez , Ana M. Wägner , Conceição Granja , Cristina Soguero-Ruiz","doi":"10.1016/j.jbi.2025.104821","DOIUrl":"10.1016/j.jbi.2025.104821","url":null,"abstract":"<div><h3>Objective:</h3><div>Machine learning (ML) models have been extensively used for tabular data classification, but recent work has transformed tabular data into images, aiming to leverage the predictive performance of convolutional neural networks (CNNs). However, most of these approaches fail to convert data with a low number of samples and mixed-type features. This study aims to evaluate the performance of the tabular-to-image method named low mixed-image generator for tabular data (LM-IGTD) and to assess the effectiveness of transfer learning and fine-tuning for improving predictions on tabular data.</div></div><div><h3>Methods:</h3><div>We employed two public tabular datasets with patients diagnosed with cardiovascular diseases (CVDs): Framingham and Steno. First, both datasets were transformed into images using LM-IGTD. Then, Framingham, which contains a larger set of samples than Steno, was used to train CNN-based models. Finally, we performed transfer learning and fine-tuning using the pre-trained CNN on the Steno dataset to predict CVD risk.</div></div><div><h3>Results:</h3><div>The CNN-based model with transfer learning achieved the highest AUROC in Steno (0.855), outperforming ML models such as decision trees, K-nearest neighbours, least absolute shrinkage and selection operator (LASSO), support vector machine, and TabPFN. 
This approach improved accuracy by 2% over the best-performing traditional model, TabPFN.</div></div><div><h3>Conclusion:</h3><div>To the best of our knowledge, this is the first study that evaluates the effectiveness of applying transfer learning and fine-tuning to tabular data using tabular-to-image approaches. Through the use of CNNs’ predictive capabilities, our work also advances the diagnosis of CVD by providing a framework for early clinical intervention and decision-making support.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"165 ","pages":"Article 104821"},"PeriodicalIF":4.0,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143799192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Moxuan Ma , Muyu Wang , Lan Wei , Xiaolu Fei , Hui Chen
{"title":"Multi-modal fusion model for Time-Varying medical Data: Addressing Long-Term dependencies and memory challenges in sequence fusion","authors":"Moxuan Ma , Muyu Wang , Lan Wei , Xiaolu Fei , Hui Chen","doi":"10.1016/j.jbi.2025.104823","DOIUrl":"10.1016/j.jbi.2025.104823","url":null,"abstract":"<div><h3>Background</h3><div>Multi-modal time-varying data continuously generated during a patient’s hospitalization reflects the patient’s disease progression. Certain patient conditions may be associated with long-term states, which is a weakness of current medical multi-modal time-varying data fusion models. Daily ward round notes, as time-series long texts, are often neglected by models.</div></div><div><h3>Objective</h3><div>This study aims to develop an effective medical multi-modal time-varying data fusion model capable of extracting features from long sequences and long texts while capturing long-term dependencies.</div></div><div><h3>Methods</h3><div>We proposed a model called medical multi-modal fusion for long-term dependencies (MMF-LD) that fuses time-varying and time-invariant, tabular, and textual data. A progressive multi-modal fusion (PMF) strategy was introduced to address information loss in multi-modal time series fusion, particularly for long time-varying texts. With the integration of the attention mechanism, the long short-term storage memory (LSTsM) gained enhanced capacity to extract long-term dependencies. In conjunction with the temporal convolutional network (TCN), it extracted long-term features from time-varying sequences without neglecting the local contextual information of the time series. Model performance was evaluated on acute myocardial infarction (AMI) and stroke datasets for in-hospital mortality risk prediction and long length-of-stay prediction. 
Area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), and F1 score were used as evaluation metrics for model performance.</div></div><div><h3>Results</h3><div>The MMF-LD model demonstrated superior performance compared to other multi-modal time-varying data fusion models in model comparison experiments (AUROC: 0.947 and 0.918 in the AMI dataset, and 0.965 and 0.868 in the stroke dataset; AUPRC: 0.410 and 0.675, and 0.467 and 0.533; F1 score: 0.658 and 0.513, and 0.684 and 0.401). Ablation experiments confirmed that the proposed PMF strategy, LSTsM, and TCN modules all contributed to performance improvements as intended.</div></div><div><h3>Conclusions</h3><div>The proposed medical multi-modal time-varying data fusion architecture addresses the challenge of forgetting time-varying long textual information in time series fusion. It exhibits strength in capturing long-term dependencies and shows stable performance across multiple datasets and tasks.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"165 ","pages":"Article 104823"},"PeriodicalIF":4.0,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kory Kreimeyer , Jonathan Spiker , Oanh Dang , Suranjan De , Robert Ball , Taxiarchis Botsis
{"title":"Deduplicating the FDA adverse event reporting system with a novel application of network-based grouping","authors":"Kory Kreimeyer , Jonathan Spiker , Oanh Dang , Suranjan De , Robert Ball , Taxiarchis Botsis","doi":"10.1016/j.jbi.2025.104824","DOIUrl":"10.1016/j.jbi.2025.104824","url":null,"abstract":"<div><h3>Objective</h3><div>To improve the reliability of data mining for product safety concerns in the Food and Drug Administration’s (FDA) Adverse Event Reporting System (FAERS) by robustly identifying duplicate reports describing the same patient experience.</div></div><div><h3>Materials and methods</h3><div>A duplicate detection algorithm based on a probabilistic record linkage algorithm, including features extracted from report narratives, and designed to support FAERS case safety review as part of the Information Visualization Platform (InfoViP) has been upgraded into a full deduplication pipeline for the entire FAERS database. The pipeline contains several new and updated components, including a network analysis-based community detection routine for breaking up sparsely connected groups of duplicates constructed from chains of pairwise comparisons. The pipeline was applied to all 29 million FAERS reports to assemble groups of duplicate cases.</div></div><div><h3>Results</h3><div>The pipeline was evaluated on 12 human expert adjudicated data sets with a total of 2300 reports and was found to have better overall performance than the current tool used at the FDA for labeling duplicates on 10 of them, with F1 scores ranging from 0.36 to 0.93, with half above 0.75. 
Because minimizing false discovery increases human expert review efficiency, the improved deduplication pipeline was applied to all historic and daily incoming FAERS reports at FDA and identified about 5 million reports as duplicates.</div></div><div><h3>Conclusions</h3><div>The InfoViP deduplication pipeline is operating at FDA to identify duplicate case reports in FAERS and provide deduplicated input for improved efficiency and accuracy of safety review operations like adverse event data mining calculations.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"165 ","pages":"Article 104824"},"PeriodicalIF":4.0,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143777390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haihua Chen , Ruochi Li , Ana Cleveland , Junhua Ding
{"title":"Enhancing data quality in medical concept normalization through large language models","authors":"Haihua Chen , Ruochi Li , Ana Cleveland , Junhua Ding","doi":"10.1016/j.jbi.2025.104812","DOIUrl":"10.1016/j.jbi.2025.104812","url":null,"abstract":"<div><h3>Objective:</h3><div>Medical concept normalization (MCN) aims to map informal medical terms to formal medical concepts, a critical task in building machine learning systems for medical applications. However, most existing studies on MCN primarily focus on models and algorithms, often overlooking the vital role of data quality. This research evaluates MCN performance across varying data quality scenarios and investigates how to leverage these evaluation results to enhance data quality, ultimately improving MCN performance through the use of large language models (LLMs). The effectiveness of the proposed approach is demonstrated through a case study.</div></div><div><h3>Methods:</h3><div>We begin by conducting a data quality evaluation of a dataset used for MCN. Based on these findings, we employ ChatGPT-based zero-shot prompting for data augmentation. The quality of the generated data is then assessed across the dimensions of correctness and comprehensiveness. A series of experiments is performed to analyze the impact of data quality on MCN model performance. These results guide us in implementing LLM-based few-shot prompting to further enhance data quality and improve model performance.</div></div><div><h3>Results:</h3><div>Duplication of data items within a dataset can lead to inaccurate evaluation results. Data augmentation techniques such as zero-shot and few-shot learning with ChatGPT can introduce duplicated data items, particularly those in the mean region of a dataset’s distribution. As such, data augmentation strategies must be carefully designed, incorporating context information and training data to avoid these issues. 
Additionally, we found that including augmented data in the testing set is necessary to fairly evaluate the effectiveness of data augmentation strategies.</div></div><div><h3>Conclusion:</h3><div>While LLMs can generate high-quality data for MCN, the success of data augmentation depends heavily on the strategy employed. Our study found that few-shot learning, with prompts that incorporate appropriate context and a small, representative set of original data, is an effective approach. The methods developed in this research, including the data quality evaluation framework, LLM-based data augmentation strategies, and procedures for data quality enhancement, provide valuable insights for data augmentation and evaluation in similar deep learning applications.</div></div><div><h3>Availability:</h3><div><span><span>https://github.com/RichardLRC/mcn-data-quality-llm/tree/main/evaluation</span></span></div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"165 ","pages":"Article 104812"},"PeriodicalIF":4.0,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143777389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manqi Zhou , Alice S. Tang , Hao Zhang , Zhenxing Xu , Alison M.C. Ke , Chang Su , Yu Huang , William G. Mantyh , Michael S. Jaffee , Katherine P. Rankin , Steven T. DeKosky , Jiayu Zhou , Yi Guo , Jiang Bian , Marina Sirota , Fei Wang
{"title":"Identifying progression subphenotypes of Alzheimer’s disease from large-scale electronic health records with machine learning","authors":"Manqi Zhou , Alice S. Tang , Hao Zhang , Zhenxing Xu , Alison M.C. Ke , Chang Su , Yu Huang , William G. Mantyh , Michael S. Jaffee , Katherine P. Rankin , Steven T. DeKosky , Jiayu Zhou , Yi Guo , Jiang Bian , Marina Sirota , Fei Wang","doi":"10.1016/j.jbi.2025.104820","DOIUrl":"10.1016/j.jbi.2025.104820","url":null,"abstract":"<div><h3>Objective</h3><div>Identification of clinically meaningful subphenotypes of disease progression can enhance the understanding of disease heterogeneity and underlying pathophysiology. In this study, we propose a machine learning framework to identify subphenotypes of Alzheimer’s disease progression based on longitudinal real-world patient records.</div></div><div><h3>Methods</h3><div>The framework, dynaPhenoM, extracts coherent clinical topics across patient visits and employs a time-aware latent class analysis to characterize subphenotypes. We validated dynaPhenoM using three patient databases with a total of 3952 AD patients across the United States, demonstrating its effectiveness in revealing mild cognitive impairment (MCI) progression to AD.</div></div><div><h3>Results</h3><div>Our study identified five subphenotypes associated with distinct organ systems for disease progression from MCI to AD, including common subtypes across cohorts—respiratory, musculoskeletal, cardiovascular, and endocrine/metabolic—as well as a cohort-specific digestive subtype.</div></div><div><h3>Conclusion</h3><div>Our study unravels the complexity and heterogeneity of the progression from MCI to AD. 
These findings highlight disease progression heterogeneity and can inform both diagnostic and therapeutic strategies, thereby advancing precision medicine for Alzheimer’s disease.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"165 ","pages":"Article 104820"},"PeriodicalIF":4.0,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143777391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}