Farieda Gaber, Maqsood Shaik, Fabio Allega, Agnes Julia Bilecz, Felix Busch, Kelsey Goon, Vedran Franke, Altuna Akalin
{"title":"Evaluating large language model workflows in clinical decision support for triage and referral and diagnosis","authors":"Farieda Gaber, Maqsood Shaik, Fabio Allega, Agnes Julia Bilecz, Felix Busch, Kelsey Goon, Vedran Franke, Altuna Akalin","doi":"10.1038/s41746-025-01684-1","DOIUrl":"https://doi.org/10.1038/s41746-025-01684-1","url":null,"abstract":"<p>Accurate medical decision-making is critical for both patients and clinicians. Patients often struggle to interpret their symptoms, determine their severity, and select the right specialist. Simultaneously, clinicians face challenges in integrating complex patient data to make timely, accurate diagnoses. Recent advances in large language models (LLMs) offer the potential to bridge this gap by supporting decision-making for both patients and healthcare providers. In this study, we benchmark multiple LLM versions and an LLM-based workflow incorporating retrieval-augmented generation (RAG) on a curated dataset of 2000 medical cases derived from the Medical Information Mart for Intensive Care database. Our findings show that these LLMs are capable of providing personalized insights into likely diagnoses, suggesting appropriate specialists, and assessing urgent care needs. 
These models may also support clinicians in refining diagnoses and decision-making, offering a promising approach to improving patient outcomes and streamlining healthcare delivery.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"38 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143927321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haiko Schurz, Klara Solander, Davida Åström, Fernando Cossío, Taeyang Choi, Magnus Dustler, Claes Lundström, Håkan Gustafsson, Sophia Zackrisson, Fredrik Strand
{"title":"Simulating mismatch between calibration and target population in AI for mammography the retrospective VAIB study","authors":"Haiko Schurz, Klara Solander, Davida Åström, Fernando Cossío, Taeyang Choi, Magnus Dustler, Claes Lundström, Håkan Gustafsson, Sophia Zackrisson, Fredrik Strand","doi":"10.1038/s41746-025-01623-0","DOIUrl":"https://doi.org/10.1038/s41746-025-01623-0","url":null,"abstract":"<p>AI cancer detection models require calibration to attain the desired balance between cancer detection rate (CDR) and false positive rate. In this study, we simulate the impact of six types of mismatches between the calibration population and the clinical target population, by creating purposefully non-representative datasets to calibrate AI for clinical settings. Mismatching the acquisition year between healthy and cancer-diagnosed screening participants led to a distortion in CDR of between −3% and +19%. Mismatching age led to a distortion in CDR of between −0.2% and +27%. Mismatching breast density distribution led to a distortion in CDR of between +1% and 16%. Mismatching mammography vendors led to a distortion in CDR of between −32% and +33%. Mismatches between calibration population and target clinical population lead to clinically important deviations. 
It is vital for safe clinical AI integration to ensure that important aspects of the calibration population are representative of the target population.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"3 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143920469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chiara Carboni, Celia Brightwell, Orit Halpern, Oscar Freyer, Stephen Gilbert
{"title":"Reconciling security and care in digital medicine","authors":"Chiara Carboni, Celia Brightwell, Orit Halpern, Oscar Freyer, Stephen Gilbert","doi":"10.1038/s41746-025-01685-0","DOIUrl":"https://doi.org/10.1038/s41746-025-01685-0","url":null,"abstract":"<p>Common approaches to cybersecurity frame end users as the weakest link in a system. In this Perspective, we argue that appraising end users as contributors to security can help reconcile both security and care practices. Through two case studies, including Swedish hospitals and a UK smart-home system, we demonstrate the gaps between established security protocols and the daily reality of care practices, and outline approaches to reconcile security and care.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"21 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143920464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Avisha Das, Ish A. Talati, Juan Manuel Zambrano Chaves, Daniel Rubin, Imon Banerjee
{"title":"Weakly supervised language models for automated extraction of critical findings from radiology reports","authors":"Avisha Das, Ish A. Talati, Juan Manuel Zambrano Chaves, Daniel Rubin, Imon Banerjee","doi":"10.1038/s41746-025-01522-4","DOIUrl":"https://doi.org/10.1038/s41746-025-01522-4","url":null,"abstract":"<p>Critical findings in radiology reports are life-threatening conditions that need to be communicated promptly to physicians for timely management of patients. Although challenging, advancements in natural language processing (NLP), particularly large language models (LLMs), now enable the automated identification of key findings from verbose reports. Given the scarcity of labeled critical findings data, we implemented a two-phase, weakly supervised fine-tuning approach on 15,000 unlabeled Mayo Clinic reports. This fine-tuned model then automatically extracted critical terms on internal (Mayo Clinic, <i>n</i> = 80) and external (MIMIC-III, <i>n</i> = 123) test datasets, validated against expert annotations. Model performance was further assessed on 5000 MIMIC-IV reports using LLM-aided metrics, G-eval and Prometheus. Both manual and LLM-based evaluations showed improved task alignment with weak supervision. 
The pipeline and model, publicly available under an academic license, can aid in critical finding extraction for research and clinical use (https://github.com/dasavisha/CriticalFindings_Extract).</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"26 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143920579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Patient agency and large language models in worldwide encoding of equity","authors":"Antonis A. Armoundas, Joseph Loscalzo","doi":"10.1038/s41746-025-01598-y","DOIUrl":"https://doi.org/10.1038/s41746-025-01598-y","url":null,"abstract":"<p>Large language models are progressively improving patient engagement and access to healthcare. This is both an exciting and a concerning time, as these models no longer serve solely as a guide to clinicians but, for the first time, enable patients to make decisions that directly affect their health. We present the benefits and risks of this paradigm shift in the practice of medicine, which offers the possibility of promoting health equity.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"25 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143920467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michele Merler, Carla Agurto, Julian Peller, Esteban Roitberg, Alan Taitz, Marcos A. Trevisan, Indu Navar, James D. Berry, Ernest Fraenkel, Lyle W. Ostrow, Guillermo A. Cecchi, Raquel Norel
{"title":"Clinical assessment and interpretation of dysarthria in ALS using attention based deep learning AI models","authors":"Michele Merler, Carla Agurto, Julian Peller, Esteban Roitberg, Alan Taitz, Marcos A. Trevisan, Indu Navar, James D. Berry, Ernest Fraenkel, Lyle W. Ostrow, Guillermo A. Cecchi, Raquel Norel","doi":"10.1038/s41746-025-01654-7","DOIUrl":"https://doi.org/10.1038/s41746-025-01654-7","url":null,"abstract":"<p>Speech dysarthria is a key symptom of neurological conditions like ALS, yet existing AI models designed to analyze it from audio signals rely on handcrafted features with limited inference performance. Deep learning approaches improve accuracy but lack interpretability. We propose an attention-based deep learning AI model to assess dysarthria severity based on listener effort ratings. Using 2,102 recordings from 125 participants, rated by three speech-language pathologists on a 100-point scale, we trained models directly from recordings collected remotely. Our best model achieved an R<sup>2</sup> of 0.92 and an RMSE of 6.78. Attention-based interpretability identified key phonemes, such as vowel sounds influenced by ‘r’ (e.g., “car,” “more”), and isolated inspiration sounds as markers of speech deterioration. This model enhances precision in dysarthria assessment while maintaining clinical interpretability. 
By improving sensitivity to subtle speech changes, it offers a valuable tool for research and patient care in ALS and other neurological disorders.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"32 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143920468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Kwaro, Stefan Mendt, Julius Okoth, Stephen Munga, Hanns-Christian Gunga, Zoë Hannah Heim, Ina Matzke, Aditi Bunker, Sandra Barteit, Martina Anna Maggioni
{"title":"Acceptability and feasibility of research grade wearables for monitoring heat stress in Kenyan farmers","authors":"Daniel Kwaro, Stefan Mendt, Julius Okoth, Stephen Munga, Hanns-Christian Gunga, Zoë Hannah Heim, Ina Matzke, Aditi Bunker, Sandra Barteit, Martina Anna Maggioni","doi":"10.1038/s41746-025-01601-6","DOIUrl":"https://doi.org/10.1038/s41746-025-01601-6","url":null,"abstract":"<p>Sub-Saharan Africa faces increasing heat events due to climate change, affecting health and productivity. Wearable technology, though promising for monitoring these impacts, is underexplored in this region. This pilot study evaluated the acceptability and feasibility of research-grade wearables for monitoring heat stress among Kenyan subsistence farmers. In Siaya, 48 farmers (50% women) were monitored for 14 days using sensors to measure heart rate, core temperature, sleep, activity, and geo-location, alongside environmental data loggers for wet bulb globe temperature. Participants mostly rated their experience on a 5-point Likert scale and provided additional non-Likert feedback, with over 95% reporting high device likability and minimal disruption. Data availability was 88% for actigraphy and 100% for core temperature, with a median completeness of 100% for most devices. Women experienced greater heat strain than men. 
These findings demonstrate that research-grade wearables are acceptable and feasible for real-time heat stress monitoring in rural Africa.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"66 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143915617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kensuke Sakata, Carolyna A. P. Yamamoto, Adityo Prakosa, Brock M. Tice, Syed Yusuf Ali, Shane Loeffler, Eugene G. Kholmovski, Sunil Kumar Sinha, Joseph E. Marine, Hugh Calkins, David D. Spragg, Natalia A. Trayanova
{"title":"Digital twins enable stratification of persistent atrial fibrillation patients for ablation diminishing unnecessary heart damage","authors":"Kensuke Sakata, Carolyna A. P. Yamamoto, Adityo Prakosa, Brock M. Tice, Syed Yusuf Ali, Shane Loeffler, Eugene G. Kholmovski, Sunil Kumar Sinha, Joseph E. Marine, Hugh Calkins, David D. Spragg, Natalia A. Trayanova","doi":"10.1038/s41746-025-01625-y","DOIUrl":"https://doi.org/10.1038/s41746-025-01625-y","url":null,"abstract":"<p>Pulmonary vein isolation (PVI), the standard-of-care for atrial fibrillation (AF), is effective even in some persistent AF (PsAF) patients despite atrial fibrosis proliferation, suggesting that PVI could be not only isolating triggers but also diminishing arrhythmogenic substrates. Left atrial (LA) posterior wall isolation is the prevalent adjunctive strategy aiming to address PsAF arrhythmogenesis; however, its outcomes vary widely. To explore why current PsAF ablation treatments have limited success and under what circumstances each treatment is most effective, we utilized patient-specific heart digital twins of PsAF patients incorporating fibrosis distributions to virtually implement versions of PVI (individual ostial to wide antral) and posterior wall isolation. In most digital twins (60%), PVI greatly decreased LA substrate arrhythmogenicity without the need for wider lesions or posterior wall isolation. 
Using digital-twin findings, a strategy was developed to stratify PsAF patients to an appropriate ablation option based on fibrosis features, thus potentially avoiding unnecessary heart damage.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"9 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143915486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zixuan Hu, Hui Ming Lin, Shobhit Mathur, Robert Moreland, Christopher D. Witiw, Laura Jimenez-Juan, Matias F. Callejas, Djeven P. Deva, Ervin Sejdić, Errol Colak
{"title":"High performance with fewer labels using semi-weakly supervised learning for pulmonary embolism diagnosis","authors":"Zixuan Hu, Hui Ming Lin, Shobhit Mathur, Robert Moreland, Christopher D. Witiw, Laura Jimenez-Juan, Matias F. Callejas, Djeven P. Deva, Ervin Sejdić, Errol Colak","doi":"10.1038/s41746-025-01594-2","DOIUrl":"https://doi.org/10.1038/s41746-025-01594-2","url":null,"abstract":"<p>This study proposes a semi-weakly supervised learning approach for pulmonary embolism (PE) detection on CT pulmonary angiography (CTPA) to alleviate the resource-intensive burden of exhaustive medical image annotation. Attention-based CNN-RNN models were trained on the RSNA pulmonary embolism CT dataset and externally validated on a pooled dataset (Aida and FUMPE). Three configurations were compared: weak (examination-level labels only), strong (all examination and slice-level labels), and semi-weak (examination-level labels plus a limited subset of slice-level labels). The proportion of slice-level labels varied from 0 to 100%. Notably, semi-weakly supervised models using approximately one-quarter of the total slice-level labels achieved an AUC of 0.928, closely matching the strongly supervised model’s AUC of 0.932. External validation yielded AUCs of 0.999 for the semi-weak and 1.000 for the strong model. 
By reducing labeling requirements without sacrificing diagnostic accuracy, this method streamlines model development, accelerates the integration of models into clinical practice, and enhances patient care.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"8 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143915484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neil F. Abernethy, Kylie McCloskey, Meg Trahey, Laurie Rinn, Gail Broder, Michele Andrasik, Rebecca Laborde, Daniel McGhan, Scott Spendolini, Senthil Marimuthu, Adam Kanzmeier, Jayson Hanes, James G. Kublin
{"title":"Rapid development of a registry to accelerate COVID-19 vaccine clinical trials","authors":"Neil F. Abernethy, Kylie McCloskey, Meg Trahey, Laurie Rinn, Gail Broder, Michele Andrasik, Rebecca Laborde, Daniel McGhan, Scott Spendolini, Senthil Marimuthu, Adam Kanzmeier, Jayson Hanes, James G. Kublin","doi":"10.1038/s41746-025-01666-3","DOIUrl":"https://doi.org/10.1038/s41746-025-01666-3","url":null,"abstract":"<p>Response to the SARS-CoV-2 pandemic required the unprecedented, rapid activation of the COVID-19 Prevention Network (CoVPN) representing hundreds of sites conducting vaccine clinical trials. The CoVPN Volunteer Screening Registry (VSR) collected participant information, distributed qualified candidates across sites, and monitored enrollment progress. The system consisted of three web-based interfaces. The Volunteer Questionnaire flowed into a secure database. The Site Portal supported volunteer selection, analytics, and enrollment. The Administrative Portal enabled dynamic analytic reports by geography, clinical trial, and site, including volunteering rates over time. The VSR collected over 650,000 volunteers, serving a key role in the recruitment of diverse participants for multiple Phase 3 clinical trials. Over 47% of the 166,729 volunteers selected for screening represented prioritized groups. The success of the VSR demonstrates how digital tools can be rapidly yet safely integrated into an accelerated clinical trial setting. 
We summarize the development of the system and lessons learned for pandemic preparedness.</p>","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":"51 1","pages":""},"PeriodicalIF":15.2,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143909704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}