Journal of the American Medical Informatics Association: Latest Articles

Towards responsible artificial intelligence in healthcare-getting real about real-world data and evidence.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-26 DOI: 10.1093/jamia/ocaf133
Eileen Koski, Amar Das, Pei-Yun Sabrina Hsueh, Anthony Solomonides, Amanda L Joseph, Gyana Srivastava, Carl Erwin Johnson, Joseph Kannry, Bilikis Oladimeji, Amy Price, Steven Labkoff, Gnana Bharathy, Baihan Lin, Douglas Fridsma, Lee A Fleisher, Monica Lopez-Gonzalez, Reva Singh, Mark G Weiner, Robert Stolper, Russell Baris, Suzanne Sincavage, Tristan Naumann, Tayler Williams, Tien Thi Thuy Bui, Yuri Quintana
Background: The use of real-world data (RWD) in artificial intelligence (AI) applications for healthcare offers unique opportunities but also poses complex challenges related to interpretability, transparency, safety, efficacy, bias, equity, privacy, ethics, accountability, and stakeholder engagement.

Methods: A multi-stakeholder expert panel comprising healthcare professionals, AI developers, policymakers, and other stakeholders was assembled. Their task was to identify critical issues and formulate consensus recommendations, focusing on the responsible use of RWD in healthcare AI. The panel's work involved an in-person conference and workshop and extensive deliberations over several months.

Results: The panel's findings revealed several critical challenges, including the necessity for data literacy and documentation, the identification and mitigation of bias, privacy and ethics considerations, and the absence of an accountability structure for stakeholder management. To address these, the panel proposed a series of recommendations, such as the adoption of metadata standards for RWD sources, the development of transparency frameworks and instructional labels likened to "nutrition labels" for AI applications, the provision of cross-disciplinary training materials, the implementation of bias detection and mitigation strategies, and the establishment of ongoing monitoring and update processes.

Conclusion: Guidelines and resources focused on the responsible use of RWD in healthcare AI are essential for developing safe, effective, equitable, and trustworthy applications. The proposed recommendations provide a foundation for a comprehensive framework addressing the entire lifecycle of healthcare AI, emphasizing the importance of documentation, training, transparency, accountability, and multi-stakeholder engagement.
Citations: 0
A communication-efficient federated learning algorithm to assess racial disparities in post-transplantation survival time.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-24 DOI: 10.1093/jamia/ocaf138
Yudong Wang, Dazheng Zhang, Jiayi Tong, Xing He, Liang Li, Lichao Sun, Ashutosh M Shukla, Jiang Bian, David A Asch, Yong Chen
Objective: Patients of different races have different outcomes following renal transplantation. Patients of different races also undergo renal transplantation at different hospitals. We used a novel decentralized multisite approach to quantitatively assess the effect of site of care on racial disparities between non-Hispanic Black (NHB) and non-Hispanic White (NHW) patients in post-transplantation survival times.

Materials and methods: In this study, we develop a communication-efficient federated learning algorithm to assess site-of-care associated racial disparities based on decentralized time-to-event data, called Communication-Efficient Distributed Analysis for Racial Disparity in Time-to-event Data (CEDAR-t2e). The algorithm includes 2 modules. Module I estimates the site-specific proportional hazards model for time-to-event outcomes in a distributed manner, using Poissonization to simplify the estimation procedure. Based on the estimated results from Module I, Module II calculates how long the kidney failure time of NHB patients would be extended had they been admitted to transplant centers in the same distribution as NHW patients were admitted.

Results: With application to United States Renal Data System data covering 39 043 patients across 73 transplant centers, we found no evidence suggesting the presence of site-of-care associated racial disparities in post-transplantation survival times. In particular, restricting to one year after transplantation, the counterfactual graft failure time would have been extended by only 0.61 days on average if NHB patients had the same admission distribution to transplant centers as NHW patients.

Discussion: The proposed approach offers a quantitative measure to evaluate site-of-care associated racial disparities.

Conclusion: Our approach has the potential to be extended to investigate site-of-care related disparities in other time-to-event outcomes, thus promoting health equity and improving patient health in various fields.
Citations: 0
Development and application of desiderata for automated clinical ordering.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-23 DOI: 10.1093/jamia/ocaf152
Sameh N Saleh, Kevin B Johnson
Introduction: Automation of clinical orders in electronic health records (EHRs) has the potential to reduce clinician burden and enhance patient safety. However, determining which orders are appropriate for automation requires a structured framework to ensure clinical validity, transparency, and safety.

Objective: To develop and validate a framework of desiderata for assessing the appropriateness of automating clinical orders in EHRs and to demonstrate its operational value in a live health system dataset.

Materials and methods: The study comprised 4 phases to move from concept generation to real-world demonstration. First, we conducted focus group analyses using grounded theory to identify themes and developed desiderata informed by these themes and existing literature. We validated the desiderata by surveying clinicians at a single institution, presenting 10 use cases and assessing perceived appropriateness, cognitive support, and patient safety using a 4-point Likert scale. Survey results were compared to a priori appropriateness designations using t-tests. To evaluate operational impact, we analyzed one year of order-based alerts and orders (1.4 million alert firings and 44.1 million orders, respectively) using filtering rules and association rule mining to identify candidate orders for automation and their impact.

Results: We identified 8 desiderata for automated order appropriateness: logical consistency, data provenance, order transparency, context permanence, monitoring plans, trigger consistency, care team empowerment, and system accountability. Use cases deemed appropriate based on these criteria received significantly higher scores for appropriateness (3.13 ± 0.84 vs 2.30 ± 0.99), cognitive support (3.08 ± 0.82 vs 2.25 ± 0.94), and patient safety (3.08 ± 0.86 vs 2.21 ± 0.98) (all P < .001) compared to those considered inappropriate. Operational analysis revealed an alert firing 19 109 times annually, with a 96% signed order rate, where automation could save an estimated 26.5 provider hours per year. Additionally, an association rule with 16 628 occurrences (68.4% confidence) suggested automation could save 15.8 hours annually and yield 8000 additional appropriate orders.

Discussion: The desiderata align with clinician perceptions and provide a structured approach for evaluating automated orders. Our findings highlight the potential for automation of certain clinical orders to improve cognitive support while maintaining patient safety.

Conclusion: Healthcare systems should use these desiderata, coupled with data mining techniques, to systematically identify and govern appropriate automated orders. Further research is needed to validate operational scalability.
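
The abstract above reports an association rule by its occurrence count and confidence. As a minimal sketch of how those two quantities are usually defined, the snippet below computes support and confidence for a rule "antecedent implies consequent" over a toy list of encounters. The item names (`alert`, `order`) and the `rule_metrics` helper are hypothetical and are not taken from the study, whose mining pipeline is not described here.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all transactions containing antecedent and consequent
    confidence = share of antecedent-containing transactions that also
                 contain the consequent
    """
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data (illustrative only): each set is one encounter.
encounters = [
    {"alert", "order"},
    {"alert", "order"},
    {"alert"},
    {"other"},
]
support, confidence = rule_metrics(encounters, {"alert"}, {"order"})
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A high-confidence rule means the consequent order almost always follows the antecedent, which is the kind of pattern the abstract treats as a candidate for automation.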
Citations: 0
Including AI in diffusion-weighted breast MRI has potential to increase reader confidence and reduce workload.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-23 DOI: 10.1093/jamia/ocaf156
Dimitrios Bounias, Lina Simons, Michael Baumgartner, Chris Ehring, Peter Neher, Lorenz A Kapsner, Balint Kovacs, Ralf Floca, Paul F Jaeger, Jessica Eberle, Dominique Hadler, Frederik B Laun, Sabine Ohlmeyer, Lena Maier-Hein, Michael Uder, Evelyn Wenkel, Klaus H Maier-Hein, Sebastian Bickelhaupt
Objectives: Breast diffusion-weighted imaging (DWI) has shown potential as a standalone imaging technique for certain indications, eg, supplemental screening of women with dense breasts. This study evaluates an artificial intelligence (AI)-powered computer-aided diagnosis (CAD) system for clinical interpretation and workload reduction in breast DWI.

Materials and methods: This retrospective IRB-approved study included n = 824 examinations for model development (2017-2020) and n = 235 for evaluation (01/2021-06/2021). Readings were performed by three readers using either the AI-CAD or manual readings. BI-RADS-like (Breast Imaging Reporting and Data System) classification was based on DWI. Histopathology served as ground truth. The model was nnDetection-based, trained using 5-fold cross-validation and ensembling. Statistical significance was determined using McNemar's test. Inter-rater agreement was calculated using Cohen's kappa. Model performance was calculated using the area under the receiver operating characteristic curve (AUC).

Results: The AI-augmented approach significantly reduced BI-RADS-like 3 calls in breast DWI by 29% (P = .019) and increased inter-rater agreement (0.57 ± 0.10 vs 0.49 ± 0.11), while preserving diagnostic accuracy. Two of the three readers detected more malignant lesions (63/69 vs 59/69 and 64/69 vs 62/69) with the AI-CAD. The AI model achieved an AUC of 0.78 (95% CI: [0.72, 0.85]; P < .001), which increased for women at screening age to 0.82 (95% CI: [0.73, 0.90]; P < .001), indicating a potential for workload reduction of 20.9% at 96% sensitivity.

Discussion and conclusion: Breast DWI might benefit from AI support. In our study, AI showed potential for reduction of BI-RADS-like 3 calls and increase of inter-rater agreement. However, given the limited study size, further research is needed.
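
Inter-rater agreement in the abstract above is measured with Cohen's kappa, which corrects observed agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch for two raters with categorical labels. The `cohens_kappa` function and the toy ratings are illustrative assumptions; how the study aggregated agreement across its three readers is not stated here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty case list")
    n = len(rater_a)
    # Observed agreement: share of cases with identical labels.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy ratings (illustrative only): 1 = suspicious, 0 = not suspicious.
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```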
Citations: 0
A scalable framework for benchmark embedding models in semantic health-care tasks.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-22 DOI: 10.1093/jamia/ocaf149
Shelly Soffer, Mahmud Omar, Moran Gendler, Benjamin S Glicksberg, Patricia Kovatch, Orly Efros, Robert Freeman, Alexander W Charney, Girish N Nadkarni, Eyal Klang
Objectives: Text embeddings are promising for semantic tasks, such as retrieval-augmented generation (RAG). However, their application in health care is underexplored due to a lack of benchmarking methods. We introduce a scalable benchmarking method to test embeddings for health-care semantic tasks.

Materials and methods: We evaluated 39 embedding models across 7 medical semantic similarity tasks using diverse datasets. These datasets comprised real-world patient data (from the Mount Sinai Health System and MIMIC IV), biomedical texts from PubMed, and synthetic data generated with Llama-3-70b. We first assessed semantic textual similarity (STS) by correlating the model-generated similarity scores with noise levels using Spearman rank correlation. We then reframed the same tasks as retrieval problems, evaluated by mean reciprocal rank and recall at k.

Results: In total, evaluating 2000 text pairs per 7 tasks for STS and retrieval yielded 3.28 million model assessments. Larger models (>7b parameters), such as those based on Mistral-7b and Gemma-2-9b, consistently performed well, especially in long-context tasks. The NV-Embed-v1 model (7b parameters), although top in short tasks, underperformed in long tasks. For short tasks, smaller models such as b1ade-embed (335M parameters) performed on par with the larger models. For long retrieval tasks, the larger models significantly outperformed the smaller ones.

Discussion: The proposed benchmarking framework demonstrates scalability and flexibility, offering a structured approach to guide the selection of embedding models for a wide range of health-care tasks.

Conclusion: By matching the appropriate model with the task, the framework enables more effective deployment of embedding models, enhancing critical applications such as semantic search and retrieval-augmented generation (RAG).
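
The retrieval evaluation in the abstract above uses mean reciprocal rank (MRR) and recall at k. As a rough sketch of those two metrics, assuming each query has exactly one relevant document, the functions below take a ranked list of document IDs per query. The function names and toy rankings are illustrative and do not reproduce the benchmark's own code.

```python
def reciprocal_rank(ranked_ids, relevant_id):
    """1 / rank of the relevant document, or 0.0 if it was not retrieved."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, relevant_ids):
    """Average reciprocal rank over all queries."""
    pairs = list(zip(rankings, relevant_ids))
    return sum(reciprocal_rank(r, t) for r, t in pairs) / len(pairs)


def recall_at_k(rankings, relevant_ids, k):
    """Share of queries whose relevant document appears in the top k."""
    pairs = list(zip(rankings, relevant_ids))
    return sum(1 for r, t in pairs if t in r[:k]) / len(pairs)


# Toy rankings (illustrative only): the relevant document is "a" each time.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
targets = ["a", "a", "a"]
print(mean_reciprocal_rank(rankings, targets))  # (1 + 1/2 + 1/3) / 3
print(recall_at_k(rankings, targets, k=2))      # 2 of 3 queries hit
```

MRR rewards placing the relevant item near the top, while recall at k only asks whether it appears within the first k results, so the two can rank models differently.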
Citations: 0
Predicting treatment retention in medication for opioid use disorder: a machine learning approach using NLP and LLM-derived clinical features.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-22 DOI: 10.1093/jamia/ocaf157
Fateme Nateghi Haredasht, Ivan Lopez, Steven Tate, Pooya Ashtari, Min Min Chan, Deepali Kulkarni, Chwen-Yuen Angie Chen, Maithri Vangala, Kira Griffith, Bryan Bunning, Adam S Miner, Tina Hernandez-Boussard, Keith Humphreys, Anna Lembke, L Alexander Vance, Jonathan H Chen
Objective: Building upon our previous work on predicting treatment retention in medications for opioid use disorder, we aimed to improve 6-month retention prediction in buprenorphine-naloxone (BUP-NAL) therapy by incorporating features derived from large language models (LLMs) applied to unstructured clinical notes.

Materials and methods: We used de-identified electronic health record (EHR) data from Stanford Health Care (STARR) for model development and internal validation, and the NeuroBlu behavioral health database for external validation. Structured features were supplemented with 13 clinical and psychosocial features extracted from free-text notes using the CLinical Entity Augmented Retrieval pipeline, which combines named entity recognition with LLM-based classification to provide contextual interpretation. We trained classification (Logistic Regression, Random Forest, XGBoost) and survival models (CoxPH, Random Survival Forest, Survival XGBoost), evaluated using Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) and C-index.

Results: XGBoost achieved the highest classification performance (ROC-AUC = 0.65). Incorporating LLM-derived features improved model performance across all architectures, with the largest gains observed in simpler models such as Logistic Regression. In time-to-event analysis, Random Survival Forest and Survival XGBoost reached the highest C-index (≈0.65). SHapley Additive exPlanations analysis identified LLM-extracted features like Chronic Pain, Liver Disease, and Major Depression as key predictors. We also developed an interactive web tool for real-time clinical use.

Discussion: Features extracted using NLP and LLM-assisted methods improved model accuracy and interpretability, revealing valuable psychosocial risks not captured in structured EHRs.

Conclusion: Combining structured EHR data with LLM-extracted features moderately improves BUP-NAL retention prediction, enabling personalized risk stratification and advancing AI-driven care for substance use disorders.
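
The survival models in the abstract above are scored with the C-index. A simplified sketch of Harrell's concordance index follows: among comparable patient pairs, it counts how often the patient who left treatment earlier was also assigned the higher risk score. This version skips pairs with tied times for brevity, which is an assumption of the sketch; the study's own implementation is not described here, and the function name and toy data are illustrative.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times       : observed time per subject (event or censoring)
    events      : 1 if the event was observed, 0 if censored
    risk_scores : model output, higher = expected earlier event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored: pair is not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustrative only): risk ordering matches event ordering.
print(concordance_index([1, 2, 3], [1, 1, 0], [0.9, 0.5, 0.1]))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported value of about 0.65 in context.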
Citations: 0
Large language models accurately identify immunosuppression in intensive care unit patients.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-22 DOI: 10.1093/jamia/ocaf141
Vijeeth Guggilla, Mengjia Kang, Melissa J Bak, Steven D Tran, Anna Pawlowski, Prasanth Nannapaneni, Luke V Rasmussen, Daniel Schneider, Helen K Donnelly, Ankit Agrawal, David Liebovitz, Alexander V Misharin, G R Scott Budinger, Richard G Wunderink, Theresa L Walunas, Catherine A Gao
Objective: Rule-based structured data algorithms and natural language processing (NLP) approaches applied to unstructured clinical notes have limited accuracy and poor generalizability for identifying immunosuppression. Large language models (LLMs) may effectively identify patients with heterogeneous types of immunosuppression from unstructured clinical notes. We compared the performance of LLMs applied to unstructured notes for identifying patients with immunosuppressive conditions or immunosuppressive medication use against 2 baselines: (1) structured data algorithms using diagnosis codes and medication orders and (2) NLP approaches applied to unstructured notes.

Materials and methods: We used hospital admission notes from a primary cohort of 827 intensive care unit (ICU) patients at Northwestern Memorial Hospital and a validation cohort of 200 ICU patients at Beth Israel Deaconess Medical Center, along with diagnosis codes and medication orders from the primary cohort. We evaluated the performance of structured data algorithms, NLP approaches, and LLMs in identifying 7 immunosuppressive conditions and 6 immunosuppressive medications.

Results: In the primary cohort, structured data algorithms achieved peak F1 scores ranging from 0.30 to 0.97 for identifying immunosuppressive conditions and medications. NLP approaches achieved peak F1 scores ranging from 0 to 1. GPT-4o outperformed or matched structured data algorithms and NLP approaches across all conditions and medications, with F1 scores ranging from 0.51 to 1. GPT-4o also performed impressively in our validation cohort (F1 = 1 for 8/13 variables).

Discussion: LLMs, particularly GPT-4o, outperformed structured data algorithms and NLP approaches in identifying immunosuppressive conditions and medications with robust external validation.

Conclusion: LLMs can be applied for improved cohort identification for research purposes.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490808/pdf/
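
All three approaches in the abstract above are compared by F1 score, the harmonic mean of precision and recall. A minimal sketch for binary labels is shown below; the function name and toy labels are illustrative. Note that F1 is 0 whenever a method finds no true positives, which is how a range "from 0 to 1" can arise across variables.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = condition present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels (illustrative only).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(precision_recall_f1(y_true, y_pred))  # precision = recall = F1 = 2/3
```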
Citations: 0
Improving postoperative length of stay forecasting with retrieval-augmented prediction.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-18 DOI: 10.1093/jamia/ocaf154
Brian H Park, Chun-Nan Hsu, Austin Nguyen, Ying Q Zhou, Rodney A Gabriel
Objective: The objective of this study is to evaluate retrieval-augmented prediction for forecasting hospital length of stay (LOS) following surgery compared to traditional machine learning (ML), standalone large language models (LLMs), and retrieval-augmented generation (RAG) approaches.

Materials and methods: Spine surgery cases were extracted from electronic health records. Structured features and operative notes were concatenated into natural language patient representations, embedded using Sentence-Bidirectional Encoder Representations from Transformers, and stored in a vector database. Eight predictive models were implemented, including a baseline model, standalone ML with embeddings, standalone LLM (Gemma 3:27B), and combinations of these with retrieval-augmented prediction or generation. The retrieval-augmented prediction model computed a similarity-weighted average LOS from nearest neighbors. Performance was assessed using R², mean absolute error (MAE), and root mean square error (RMSE).

Results: Retrieval-augmented prediction alone outperformed standalone ML and LLM models (R² = 0.39, MAE = 4.47). Combining ML or LLM outputs with retrieval-augmented prediction further improved performance. The best performing model was a neural network blended with retrieval-augmented prediction (R² = 0.52, MAE = 4.16). LLM-RAG alone reached R² = 0.19, which improved to 0.47 when combined with retrieval-augmented predictions. Retrieval-augmented prediction consistently reduced MAE and RMSE by up to 32% and 38%, respectively.

Discussion: Retrieval-augmented prediction offers interpretable and resource-efficient forecasting by semantically leveraging prior patient cases without generative modeling. It consistently outperformed RAG and ML across metrics, approximating clinical reasoning via similarity-based inference.

Conclusion: Retrieval-augmented prediction significantly enhances LOS prediction accuracy over standard ML and LLM models. Its interpretability and scalability make it a promising solution for integrating predictive analytics into clinical workflows.
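
The abstract above describes the retrieval-augmented prediction model as a similarity-weighted average LOS over nearest neighbors. The sketch below shows one plausible reading of that step, under stated assumptions: cosine similarity over embedding vectors, the top k neighbors, and negative similarities clipped to zero. The study's actual similarity measure, value of k, and weighting scheme are not given here, and the function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieval_augmented_los(query, case_embeddings, case_los, k=3):
    """Similarity-weighted mean LOS of the k most similar prior cases."""
    scored = sorted(
        ((cosine_similarity(query, emb), los)
         for emb, los in zip(case_embeddings, case_los)),
        reverse=True,
    )[:k]
    # Assumption of this sketch: clip negative similarities to zero weight.
    weights = [max(sim, 0.0) for sim, _ in scored]
    total = sum(weights)
    if total == 0.0:
        # Fall back to an unweighted mean if no neighbor is similar at all.
        return sum(los for _, los in scored) / len(scored)
    return sum(w * los for w, (_, los) in zip(weights, scored)) / total


# Toy case base (illustrative only): 2-d embeddings with LOS in days.
cases = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
los_days = [2.0, 10.0, 6.0]
print(retrieval_augmented_los([1.0, 0.0], cases, los_days, k=2))
```

Because the prediction is a weighted average of retrieved cases, the neighbors themselves can be shown to a clinician, which is consistent with the interpretability the abstract attributes to the approach.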
Citations: 0
Hillclimb-Causal Inference: a data-driven approach to identify causal pathways among parental behaviors, genetic risk, and externalizing behaviors in children.
IF 4.6, Zone 2 (Medicine)
Journal of the American Medical Informatics Association Pub Date : 2025-09-17 DOI: 10.1093/jamia/ocaf153
Mengman Wei, Qian Peng
Objectives: Externalizing behaviors in children, such as aggression, hyperactivity, and defiance, are influenced by complex interplays between genetic predispositions and environmental factors, particularly parental behaviors. Unraveling these intricate causal relationships can benefit from the use of robust data-driven methods.

Materials and methods: We developed "Hillclimb-Causal Inference," a causal discovery approach that integrates the Hill Climb Search algorithm with a customized Linear Gaussian Bayesian Information Criterion (BIC). This method was applied to data from the Adolescent Brain Cognitive Development (ABCD) Study, which included parental behavior assessments, children's genotypes, and externalizing behavior measures. We performed dimensionality reduction to address multicollinearity among parental behaviors and assessed children's genetic risk for externalizing disorders using polygenic risk scores (PRS), which were computed based on GWAS summary statistics from independent cohorts. Once the causal pathways were identified, we employed structural equation modeling (SEM) to quantify the relationships within the model.

Results: We identified prominent causal pathways linking parental behaviors to children's externalizing outcomes. Parental alcohol misuse and broader behavioral issues exhibited notably stronger direct effects (0.33 and 0.20, respectively) compared to children's PRS (0.07). Moreover, when considering both direct and indirect paths, parental substance misuse (alcohol, drugs, and tobacco) collectively resulted in a total effect exceeding 1.1 on externalizing behaviors. Bootstrap and sensitivity analyses further validated the robustness of these findings.

Discussion and conclusion: Parental behaviors exert larger effects on children's externalizing outcomes than genetic risk, suggesting potential targets for prevention and intervention. The Hillclimb-Causal framework provides a general, data-driven way to map causal pathways in developmental psychiatry and related domains.
Citations: 0
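The abstract above reports total effects that combine direct and indirect paths. As a minimal sketch of that bookkeeping (not the authors' code), the total effect in a linear path model can be computed as the sum, over every directed path from cause to outcome, of the product of the coefficients along that path. The graph, the node names, and the coefficients below are hypothetical and chosen only for illustration; they are not the study's estimates, and the function assumes the graph is acyclic.

```python
def total_effect(edges, source, target):
    """Sum of products of path coefficients over all directed paths.

    edges maps (parent, child) -> linear path coefficient.
    Assumes the graph is a DAG; a cycle would recurse forever.
    """
    if source == target:
        return 1.0  # empty path contributes a factor of 1
    total = 0.0
    for (parent, child), coef in edges.items():
        if parent == source:
            # Extend every path child -> target by the edge source -> child.
            total += coef * total_effect(edges, child, target)
    return total


# Hypothetical three-node model: one direct path and one mediated path.
edges = {
    ("parent_behavior", "family_conflict"): 0.5,
    ("family_conflict", "externalizing"): 0.4,
    ("parent_behavior", "externalizing"): 0.3,
}

direct = edges[("parent_behavior", "externalizing")]
total = total_effect(edges, "parent_behavior", "externalizing")
indirect = total - direct
print(f"direct={direct:.2f} indirect={indirect:.2f} total={total:.2f}")
```

With standardized coefficients, a total effect can exceed any single direct effect because each mediated path adds its own product term.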
A self-report measure of digital skills needed to use digital health tools among older adults-the Skills Measurement and Readiness Training for Digital Health (SMART Digital Health) Scale.
IF 4.6, Zone 2, Medicine
Journal of the American Medical Informatics Association Pub Date : 2025-09-13 DOI: 10.1093/jamia/ocaf151
Lina Tieu, Courtney R Lyles, Hyunjin Cindy Kim, Isabel Luna, Jeanette Wong, Naomi Lopez-Solano, Junhong Li, Andersen Yang, Jorge A Rodriguez, Oanh Kieu Nguyen, Alejandra Casillas, Emilia H De Marchis, Anita L Stewart, Torsten B Neilands, Elaine C Khoong
{"title":"A self-report measure of digital skills needed to use digital health tools among older adults-the Skills Measurement and Readiness Training for Digital Health (SMART Digital Health) Scale.","authors":"Lina Tieu, Courtney R Lyles, Hyunjin Cindy Kim, Isabel Luna, Jeanette Wong, Naomi Lopez-Solano, Junhong Li, Andersen Yang, Jorge A Rodriguez, Oanh Kieu Nguyen, Alejandra Casillas, Emilia H De Marchis, Anita L Stewart, Torsten B Neilands, Elaine C Khoong","doi":"10.1093/jamia/ocaf151","DOIUrl":"https://doi.org/10.1093/jamia/ocaf151","url":null,"abstract":"<p><strong>Objective: </strong>To identify a brief scale to accurately assess digital skills among older adults for use in identifying need for support to use digital health tools.</p><p><strong>Materials and methods: </strong>Patients age ≥50 speaking English, Spanish, or Cantonese completed surveys (n = 186) assessing digital health access, use, and skills. A subsample (n = 101) completed observational task assessments gauging competency on 4 tasks essential to digital health skills: (1) launch a video visit from an email/text message hyperlink, (2) visit a specific health website, (3) sign up for a patient portal, and (4) log in to a patient portal. We used exploratory factor analysis, receiver operator characteristic, logistic regression, and dominance analysis methods to identify and evaluate a scale measuring digital skills essential to using digital health tools.</p><p><strong>Results: </strong>We found that a 9-item scale demonstrated unidimensionality and reliability (Cronbach's alpha 0.93) in measuring digital skills. Mean score was 19.3 out of 36. For each task, handout/video support was inadequate in facilitating completion for one-quarter of participants. 
We found high accuracy of the scale in predicting digital health competency (area under the curve 0.77-0.88).</p><p><strong>Discussion: </strong>The Skills Measurement and Readiness Training for Digital Health (SMART Digital Health) scale is a measure of digital skills with evidence of reliability and validity to be used as a diagnostic tool to identify patients requiring support to use digital health tools.</p><p><strong>Conclusion: </strong>This early work supports the identification of patients with digital literacy needs who may require interventions to effectively engage in digital health communication and management.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.6,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145092691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
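The abstract above summarizes scale reliability with Cronbach's alpha. As a rough illustration of how that statistic is computed (a sketch, not the study's analysis), the standard formula is alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The response matrix below is hypothetical; the 0 to 4 item scoring is an assumption inferred from the 9-item, 36-point maximum, and only three items are shown to keep the example short.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score lists.

    rows[i][j] is respondent i's score on item j.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Variance of each respondent's summed score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents, 3 items, each scored 0-4.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 4],
    [4, 4, 4],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Values near 1 indicate that the items vary together across respondents, which is what a reported alpha of 0.93 suggests for the full scale.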