Laura Brandt, Larry Au, Clinton Castro, Gabriel J. Odom
{"title":"Engaging an advisory board in discussions about the ethical relevance of algorithmic bias and fairness","authors":"Laura Brandt, Larry Au, Clinton Castro, Gabriel J. Odom","doi":"10.1038/s41746-025-01711-1","DOIUrl":"https://doi.org/10.1038/s41746-025-01711-1","url":null,"abstract":"<p>We are an interdisciplinary team of researchers that are working to advance algorithmic fairness in the research of opioid use disorders. We discuss challenges that our research team faced when engaging with our Advisory Board, as well as several strategies that we came up with to help us find a common language to ensure semantic transparency and to ensure the thick alignment with values of affected parties.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"127 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144083175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zikang Xu, Fenghe Tang, Quan Quan, Qingsong Yao, Qingpeng Kong, Jianrui Ding, Chunping Ning, S. Kevin Zhou
{"title":"Fair ultrasound diagnosis via adversarial protected attribute aware perturbations on latent embeddings","authors":"Zikang Xu, Fenghe Tang, Quan Quan, Qingsong Yao, Qingpeng Kong, Jianrui Ding, Chunping Ning, S. Kevin Zhou","doi":"10.1038/s41746-025-01641-y","DOIUrl":"https://doi.org/10.1038/s41746-025-01641-y","url":null,"abstract":"<p>Deep learning techniques have significantly enhanced the convenience and precision of ultrasound image diagnosis, particularly in the crucial step of lesion segmentation. However, recent studies reveal that both train-from-scratch models and pre-trained models often exhibit performance disparities across sex and age attributes, leading to biased diagnoses for different subgroups. In this paper, we propose <b>APPLE</b>, a novel approach designed to mitigate unfairness without altering the parameters of the base model. APPLE achieves this by learning fair perturbations in the latent space through a generative adversarial network. Extensive experiments on both a publicly available dataset and an in-house ultrasound image dataset demonstrate that our method improves segmentation and diagnostic fairness across all sensitive attributes and various backbone architectures compared to the base models. Through this study, we aim to highlight the critical importance of fairness in medical segmentation and contribute to the development of a more equitable healthcare system.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"30 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
He Zhu, Jun Bai, Na Li, Xiaoxiao Li, Dianbo Liu, David L. Buckeridge, Yue Li
{"title":"FedWeight: mitigating covariate shift of federated learning on electronic health records data through patients re-weighting","authors":"He Zhu, Jun Bai, Na Li, Xiaoxiao Li, Dianbo Liu, David L. Buckeridge, Yue Li","doi":"10.1038/s41746-025-01661-8","DOIUrl":"https://doi.org/10.1038/s41746-025-01661-8","url":null,"abstract":"<p>Federated learning (FL) enables collaborative analysis of decentralized medical data while preserving patient privacy. However, the covariate shift from demographic and clinical differences can reduce model generalizability. We propose FedWeight, a novel FL framework that mitigates covariate shift by reweighting patient data from the source sites using density estimators, allowing the trained model to better align with the distribution of the target site. To support unsupervised applications, we introduce FedWeight ETM, a federated embedded topic model. We evaluated FedWeight in cross-site FL on the eICU dataset and cross-dataset FL between eICU and MIMIC III. FedWeight consistently outperforms standard FL baselines in predicting ICU mortality, ventilator use, sepsis diagnosis, and length of stay. SHAP-based interpretation and ETM-based topic modeling reveal improved identification of clinically relevant characteristics and disease topics associated with ICU readmission.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"205 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ethan Phillips, Odhran O’Donoghue, Yumeng Zhang, Panos Tsimpos, Leigh Ann Mallinger, Stefanos Chatzidakis, Jack Pohlmann, Yili Du, Ivy Kim, Jonathan Song, Benjamin Brush, Stelios Smirnakis, Charlene J. Ong, Agni Orfanoudaki
{"title":"Hybrid machine learning for real-time prediction of edema trajectory in large middle cerebral artery stroke","authors":"Ethan Phillips, Odhran O’Donoghue, Yumeng Zhang, Panos Tsimpos, Leigh Ann Mallinger, Stefanos Chatzidakis, Jack Pohlmann, Yili Du, Ivy Kim, Jonathan Song, Benjamin Brush, Stelios Smirnakis, Charlene J. Ong, Agni Orfanoudaki","doi":"10.1038/s41746-025-01687-y","DOIUrl":"https://doi.org/10.1038/s41746-025-01687-y","url":null,"abstract":"<p>In treating malignant cerebral edema after a large middle cerebral artery stroke, clinicians need quantitative tools for real-time risk assessment. Existing predictive models typically estimate risk at one, early time point, failing to account for dynamic variables. To address this, we developed Hybrid Ensemble Learning Models for Edema Trajectory (HELMET) to predict midline shift severity, an established indicator of malignant edema, over 8-h and 24-h windows. The HELMET models were trained on retrospective data from 623 patients and validated on 63 patients from a different hospital system, achieving mean areas under the receiver operating characteristic curve of 96.6% and 92.5%, respectively. By integrating transformer-based large language models with supervised ensemble learning, HELMET demonstrates the value of combining clinician expertise with multimodal health records in assessing patient risk. Our approach provides a framework for accurate, real-time estimation of dynamic clinical targets using human-curated and algorithm-derived inputs.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"9 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joeran S. Bosma, Koen Dercksen, Luc Builtjes, Romain André, Christian Roest, Stefan J. Fransen, Constant R. Noordman, Mar Navarro-Padilla, Judith Lefkes, Natália Alves, Max J. J. de Grauw, Leander van Eekelen, Joey M. A. Spronck, Megan Schuurmans, Bram de Wilde, Ward Hendrix, Witali Aswolinskiy, Anindo Saha, Jasper J. Twilt, Daan Geijs, Jeroen Veltman, Derya Yakar, Maarten de Rooij, Francesco Ciompi, Alessa Hering, Jeroen Geerdink, Henkjan Huisman
{"title":"The DRAGON benchmark for clinical NLP","authors":"Joeran S. Bosma, Koen Dercksen, Luc Builtjes, Romain André, Christian Roest, Stefan J. Fransen, Constant R. Noordman, Mar Navarro-Padilla, Judith Lefkes, Natália Alves, Max J. J. de Grauw, Leander van Eekelen, Joey M. A. Spronck, Megan Schuurmans, Bram de Wilde, Ward Hendrix, Witali Aswolinskiy, Anindo Saha, Jasper J. Twilt, Daan Geijs, Jeroen Veltman, Derya Yakar, Maarten de Rooij, Francesco Ciompi, Alessa Hering, Jeroen Geerdink, Henkjan Huisman","doi":"10.1038/s41746-025-01626-x","DOIUrl":"https://doi.org/10.1038/s41746-025-01626-x","url":null,"abstract":"<p>Artificial Intelligence can mitigate the global shortage of medical diagnostic personnel but requires large-scale annotated datasets to train clinical algorithms. Natural Language Processing (NLP), including Large Language Models (LLMs), shows great potential for annotating clinical data to facilitate algorithm development but remains underexplored due to a lack of public benchmarks. This study introduces the DRAGON challenge, a benchmark for clinical NLP with 28 tasks and 28,824 annotated medical reports from five Dutch care centers. It facilitates automated, large-scale, cost-effective data annotation. Foundational LLMs were pretrained using four million clinical reports from a sixth Dutch care center. Evaluations showed the superiority of domain-specific pretraining (DRAGON 2025 test score of 0.770) and mixed-domain pretraining (0.756), compared to general-domain pretraining (0.734, <i>p</i> < 0.005). While strong performance was achieved on 18/28 tasks, performance was subpar on 10/28 tasks, uncovering where innovations are needed. Benchmark, code, and foundational LLMs are publicly available.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"57 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Supreeth P. Shashikumar, Sina Mohammadi, Rishivardhan Krishnamoorthy, Avi Patel, Gabriel Wardi, Joseph C. Ahn, Karandeep Singh, Eliah Aronoff-Spencer, Shamim Nemati
{"title":"Development and prospective implementation of a large language model based system for early sepsis prediction","authors":"Supreeth P. Shashikumar, Sina Mohammadi, Rishivardhan Krishnamoorthy, Avi Patel, Gabriel Wardi, Joseph C. Ahn, Karandeep Singh, Eliah Aronoff-Spencer, Shamim Nemati","doi":"10.1038/s41746-025-01689-w","DOIUrl":"https://doi.org/10.1038/s41746-025-01689-w","url":null,"abstract":"<p>Sepsis is a dysregulated host response to infection with high mortality and morbidity. Early detection and intervention have been shown to improve patient outcomes, but existing computational models relying on structured electronic health record data often miss contextual information from unstructured clinical notes. This study introduces COMPOSER-LLM, an open-source large language model (LLM) integrated with the COMPOSER model to enhance early sepsis prediction. For high-uncertainty predictions, the LLM extracts additional context to assess sepsis-mimics, improving accuracy. Evaluated on 2500 patient encounters, COMPOSER-LLM achieved a sensitivity of 72.1%, positive predictive value of 52.9%, F-1 score of 61.0%, and 0.0087 false alarms per patient hour, outperforming the standalone COMPOSER model. Prospective validation yielded similar results. Manual chart review found 62% of false positives had bacterial infections, demonstrating potential clinical utility. Our findings suggest that integrating LLMs with traditional models can enhance predictive performance by leveraging unstructured data, representing a significant advance in healthcare analytics.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"122 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vipina K. Keloth, Salih Selek, Qingyu Chen, Christopher Gilman, Sunyang Fu, Yifang Dang, Xinghan Chen, Xinyue Hu, Yujia Zhou, Huan He, Jungwei W. Fan, Karen Wang, Cynthia Brandt, Cui Tao, Hongfang Liu, Hua Xu
{"title":"Social determinants of health extraction from clinical notes across institutions using large language models","authors":"Vipina K. Keloth, Salih Selek, Qingyu Chen, Christopher Gilman, Sunyang Fu, Yifang Dang, Xinghan Chen, Xinyue Hu, Yujia Zhou, Huan He, Jungwei W. Fan, Karen Wang, Cynthia Brandt, Cui Tao, Hongfang Liu, Hua Xu","doi":"10.1038/s41746-025-01645-8","DOIUrl":"https://doi.org/10.1038/s41746-025-01645-8","url":null,"abstract":"<p>Detailed social determinants of health (SDoH) is often buried within clinical text in EHRs. Most current NLP efforts for SDoH have limitations, investigating limited factors, deriving data from a single institution, using specific patient cohorts/note types, with reduced focus on generalizability. We aim to address these issues by creating cross-institutional corpora and developing and evaluating the generalizability of classification models, including large language models (LLMs), for detecting SDoH factors using data from four institutions. Clinical notes were annotated with 21 SDoH factors at two levels: level 1 (SDoH factors only) and level 2 (SDoH factors and associated values). Compared to other models, instruction tuned LLM achieved top performance with micro-averaged F1 over 0.9 on level 1 corpora and over 0.84 on level 2 corpora. While models performed well when trained and tested on individual datasets, cross-dataset generalization highlighted remaining obstacles. Access to trained models will be made available at https://github.com/BIDS-Xu-Lab/LLMs4SDoH.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"41 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Delaney A. Smith, Adam Lavertu, Aadesh Salecha, Tymor Hamamsy, Keith Humphreys, Anna Lembke, Mathew V. Kiang, Russ B. Altman, Johannes C. Eichstaedt
{"title":"Monitoring the opioid epidemic via social media discussions","authors":"Delaney A. Smith, Adam Lavertu, Aadesh Salecha, Tymor Hamamsy, Keith Humphreys, Anna Lembke, Mathew V. Kiang, Russ B. Altman, Johannes C. Eichstaedt","doi":"10.1038/s41746-025-01642-x","DOIUrl":"https://doi.org/10.1038/s41746-025-01642-x","url":null,"abstract":"<p>The opioid epidemic persists in the U.S., with over 80,000 deaths annually since 2021, primarily driven by synthetic opioids. Responding to this evolving epidemic requires reliable and timely information. One source of data is social media platforms. We assessed the utility of Reddit data for surveillance, covering heroin, prescription, and synthetic drugs. We built a natural language processing pipeline to identify opioid-related content and created a cohort of 1,689,039 Reddit users, each assigned to a state based on their previous Reddit activity. We measured their opioid-related posts over time and compared rates against CDC overdose and NFLIS report rates. To simulate the real-world prediction of synthetic opioid overdose rates, we added near real-time Reddit data to a model relying on CDC mortality data with a typical 6-month reporting lag. Reddit data significantly improved the prediction accuracy of overdose rates. This work suggests that social media can help monitor drug epidemics.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"28 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143979388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fleur E. Marijnissen, Elyse E. C. Rijnders, Merel M. Tielemans, Désiree van Noord, Leonieke M. M. Wolters, Jeroen M. Jansen, Ingrid Schot, Frank C. Bekkering, Agnes N. Reijm, Sophia M. van Baalen, Tingting Wang, Marijke Melles, Richard Goossens, Sohal Y. Ismail, Iris Lansdorp–Vogelaar, Pieter Jan F. de Jonge, Manon C. W. Spaander
{"title":"Reducing outpatient visits for FIT-positive participants of colorectal cancer screening programs with home-based digital counselling","authors":"Fleur E. Marijnissen, Elyse E. C. Rijnders, Merel M. Tielemans, Désiree van Noord, Leonieke M. M. Wolters, Jeroen M. Jansen, Ingrid Schot, Frank C. Bekkering, Agnes N. Reijm, Sophia M. van Baalen, Tingting Wang, Marijke Melles, Richard Goossens, Sohal Y. Ismail, Iris Lansdorp–Vogelaar, Pieter Jan F. de Jonge, Manon C. W. Spaander","doi":"10.1038/s41746-025-01683-2","DOIUrl":"https://doi.org/10.1038/s41746-025-01683-2","url":null,"abstract":"<p>Digital counselling can alleviate the burden on healthcare systems and patients. While it has been evaluated as a supplement to standard care or a substitute for follow-up visits, its use for initial triaging and counselling remains unstudied. We developed a Digital Intake Tool (DIT) to facilitate the entire pre-colonoscopy counselling process for FIT-positive participants of a colorectal cancer screening program digitally, replacing the need for physicians. In this multicentre prospective non-inferiority study, we evaluated if the DIT could replace in-person counselling. DIT-counselling resulted in adequately prepared participants in 96.5%, compared to 97.6% after in-person counselling, demonstrating non-inferiority. Outpatient visits were significantly reduced, with only 3.4% requiring face-to-face consultations. Patient experiences were highly positive, without increased psychological distress or anxiety, and effective knowledge transfer. This approach benefits patients and healthcare systems, allowing patients to receive care at home, reducing travel and carbon emissions, while increasing outpatient capacity. ICTRP-registration: NL9315, March 8, 2021.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"52 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143979463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Schwappach, Wolf Hautz, Gert Krummrey, Yvonne Pfeiffer, Raj M. Ratwani
{"title":"EMR usability and patient safety: a national survey of physicians","authors":"David Schwappach, Wolf Hautz, Gert Krummrey, Yvonne Pfeiffer, Raj M. Ratwani","doi":"10.1038/s41746-025-01657-4","DOIUrl":"https://doi.org/10.1038/s41746-025-01657-4","url":null,"abstract":"<p>Despite widespread adoption of electronic medical records (EMRs), concerns persist regarding their usability and implications for patient safety. This national cross-sectional survey assessed physicians’ perceptions of EMR usability across safety-relevant domains. Among 1933 respondents from diverse care settings, 56% reported that their EMR did not enhance patient safety, and 50% perceived their system as inefficient. Usability ratings averaged 52% of the maximum score. Statistically significant differences were observed between EMRs in outpatient (η² = 0.13) and hospital (η² = 0.37) settings. Multilevel modeling attributed 38% of the variance in usability ratings to differences between EMRs, 51% to hospital-level variation within EMRs, and 11% to physician-level differences. Canonical discriminant analysis identified key differentiating usability features, including system response times, excessive alerts, prevention of data entry errors, and support for collaboration. These findings underscore substantial limitations in current EMR systems and reinforce the value of comparative usability assessments to inform targeted improvements in digital health infrastructure.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"29 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143979394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}