{"title":"Using transfer learning to improve prediction of suicide risk in acute care hospitals.","authors":"Shane J Sacco, Kun Chen, Fei Wang, Steven C Rogers, Robert H Aseltine","doi":"10.1093/jamia/ocaf126","DOIUrl":"https://doi.org/10.1093/jamia/ocaf126","url":null,"abstract":"<p><strong>Objective: </strong>Emerging efforts to identify patients at risk of suicide have focused on the development of predictive algorithms for use in healthcare settings. We address a major challenge in effective risk modeling in healthcare settings with insufficient data with which to create and apply risk models. This study aimed to improve risk prediction using transfer learning or data fusion by incorporating risk information from external data sources to augment the data available in particular clinical settings.</p><p><strong>Materials and methods: </strong>In this retrospective study, we developed predictive models in individual Connecticut hospitals using medical claims data. We compared conventional models containing demographics and historical medical diagnosis codes with fusion models containing conventional features and fused risk information that described similarities in historical diagnosis codes between patients from the hospital and patients receiving care for suicide attempts at other hospitals.</p><p><strong>Results: </strong>Our sample contained 27 hospitals and 636 758 18- to 64-year-old patients. Fusion improved prediction for 93% of hospitals, while slightly worsening prediction for 7%. Median areas under the ROC and precision-recall curves of conventional models were 77.6% and 3.4%, respectively. Fusion improved these metrics by a median of 3.3 and 0.3 points, respectively (Ps < .001). Median sensitivities and positive predictive values at 90% and 95% specificity were also improved (Ps < .001).</p><p><strong>Discussion: </strong>This study provided strong evidence that data fusion improved model performance across hospitals. 
Improvement was of greatest magnitude in facilities treating relatively few suicidal patients.</p><p><strong>Conclusion: </strong>Data fusion holds promise as a methodology to improve suicide risk prediction in healthcare settings with limited or incomplete data.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144715164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
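The fused-feature idea in this abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: it assumes the fused risk information is an average cosine similarity between a patient's binary diagnosis-code vector and the code vectors of suicide-attempt cases treated at other hospitals; all data and names are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two equal-length binary code vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def fused_risk_feature(patient_codes, external_case_codes):
    # Average similarity between this patient's historical diagnosis codes
    # and external patients treated for suicide attempts.
    sims = [cosine(patient_codes, case) for case in external_case_codes]
    return sum(sims) / len(sims) if sims else 0.0

# Toy binary indicator vectors over a shared diagnosis-code vocabulary.
patient = [1, 0, 1, 0, 0]
external = [[1, 0, 1, 0, 0], [0, 1, 0, 0, 0]]
fused = fused_risk_feature(patient, external)
```

The resulting `fused` value would be appended to the conventional demographic and diagnosis-history features before model fitting.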
{"title":"Preparing clinical research data for artificial intelligence readiness: insights from the National Institute of Diabetes and Digestive and Kidney Diseases data centric challenge.","authors":"Marcin J Domagalski, Yin Lu, Alexander Pilozzi, Alicia Williamson, Padmini Chilappagari, Emma Luker, Courtney D Shelley, Anya Dabic, Michael A Keller, Rebecca M Rodriguez, Sharon Lawlor, Ratna R Thangudu","doi":"10.1093/jamia/ocaf114","DOIUrl":"https://doi.org/10.1093/jamia/ocaf114","url":null,"abstract":"<p><strong>Objectives: </strong>The success of artificial intelligence (AI) and machine learning (ML) approaches in biomedical research depends on the quality of the underlying data. The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Data Centric Challenge was designed to address the challenge of making raw clinical research data AI ready, with a focus on type 1 diabetes studies available in the NIDDK Central Repository (NIDDK-CR). This paper aims to present a structured methodology for enhancing the AI readiness of clinical datasets.</p><p><strong>Materials and methods: </strong>We detail a systematic approach for data aggregation and preprocessing, including binning continuous data, processing text features, managing missing values, and encoding for categorical variables while maintaining the data integrity and compatibility with ML algorithms.</p><p><strong>Results: </strong>We applied the proposed methodology to transform raw clinical data from type 1 diabetes studies in the NIDDK-CR into a structured, AI-ready dataset. 
The evaluation process validated the effectiveness of our AI-readiness enhancement steps and explored the potential use cases in type 1 diabetes research.</p><p><strong>Discussion: </strong>The methodology discussed in this paper will serve as guidance for preparing data for AI-driven clinical research, with the resulting AI-ready data to serve as a training tool for building and improving AI/ML model performance.</p><p><strong>Conclusion: </strong>We present a generalizable framework for preparing clinical research data for AI applications. The resulting datasets lay a strong foundation for downstream AI/ML applications, setting the stage for a new era of data-driven discoveries.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144709709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
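The preprocessing steps named in this abstract (binning continuous values, flagging missingness, encoding categoricals) can be illustrated on a toy record. The cut-points and field names below are hypothetical and are not drawn from the NIDDK-CR pipeline.

```python
def bin_value(x, edges):
    # Map a continuous value to a bin index; None (missing) stays None.
    if x is None:
        return None
    for i, edge in enumerate(edges):
        if x < edge:
            return i
    return len(edges)

def one_hot(category, vocabulary):
    # Encode a categorical value as a one-hot vector over a fixed vocabulary.
    return [1 if category == v else 0 for v in vocabulary]

def make_ai_ready(record):
    # Hypothetical row: HbA1c binned at common clinical cut-points,
    # missingness carried as an explicit indicator, sex one-hot encoded.
    hba1c_bin = bin_value(record.get("hba1c"), edges=[5.7, 6.5])
    return {
        "hba1c_bin": -1 if hba1c_bin is None else hba1c_bin,
        "hba1c_missing": 1 if hba1c_bin is None else 0,
        "sex": one_hot(record.get("sex"), ["F", "M"]),
    }

row = make_ai_ready({"hba1c": 7.1, "sex": "F"})
```

Keeping the missingness indicator separate from the bin index preserves data integrity while remaining compatible with most ML algorithms.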
{"title":"Explaining alerts from a pediatric risk prediction model using clinical text.","authors":"Samuel Nycklemoe, Sriharsha Devarapu, Yanjun Gao, Kyle Carey, Nicholas Kuehnel, Neil Munjal, Priti Jani, Matthew Churpek, Dmitriy Dligach, Majid Afshar, Anoop Mayampurath","doi":"10.1093/jamia/ocaf121","DOIUrl":"https://doi.org/10.1093/jamia/ocaf121","url":null,"abstract":"<p><strong>Objective: </strong>Risk prediction models are used in hospitals to identify pediatric patients at risk of clinical deterioration, enabling timely interventions and rescue. The objective of this study was to develop a new explainer algorithm that uses a patient's clinical notes to generate text-based explanations for risk prediction alerts.</p><p><strong>Materials and methods: </strong>We conducted a retrospective study of 39 406 patient admissions to the American Family Children's Hospital at the University of Wisconsin-Madison (2009-2020). The validated pediatric Calculated Assessment of Risk and Triage (pCART) risk prediction model was used to identify children at risk for deterioration. A transformer model was trained to use clinical notes from the 12-hour period preceding each pCART score to predict whether a patient was flagged as at risk. Then, label-aware attention highlighted text phrases most important to an at-risk alert. The study cohort was randomly split into derivation (60%) and validation (20%) data, and a separate test set (20%) was used to evaluate the explainer's performance.</p><p><strong>Results: </strong>Our pCART Explainer algorithm performed well in discriminating at-risk pCART alert vs no alert (c-statistic 0.805).
Sample explanations from pCART Explainer revealed clinically important phrases such as \"rapid breathing,\" \"fall risk,\" \"distension,\" and \"grunting,\" thereby demonstrating excellent face validity.</p><p><strong>Discussion: </strong>The pCART Explainer could quickly orient clinicians to the patient's condition by drawing attention to key phrases in notes, potentially enhancing situational awareness and guiding decision-making.</p><p><strong>Conclusion: </strong>We developed pCART Explainer, a novel algorithm that highlights text within clinical notes to provide medically relevant context about deterioration alerts, thereby improving the explainability of the pCART model.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
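As a rough illustration of how attention weights can surface salient phrases, the toy below ranks tokens by hypothetical per-token scores for the at-risk label. The actual pCART Explainer is a transformer with label-aware attention over full notes; this sketch only shows the final ranking step.

```python
from math import exp

def softmax(scores):
    # Numerically stable softmax over a list of raw scores.
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_phrases(tokens, label_scores, k=2):
    # Rank tokens by their (normalized) label-aware attention weight
    # and return the k most important ones.
    weights = softmax(label_scores)
    ranked = sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)
    return [t for t, _ in ranked[:k]]

tokens = ["mild", "grunting", "noted", "rapid", "breathing"]
scores = [0.1, 2.3, 0.2, 1.9, 2.0]  # hypothetical per-token logits for the at-risk label
highlights = top_phrases(tokens, scores, k=2)
```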
{"title":"Human performance evaluation of a pediatric artificial intelligence sepsis model.","authors":"Swaminathan Kandaswamy, Naveen Muthu, Nikolay Braykov, Rebekah Carter, Reena Blanco, Thuy Bui, Evan Orenstein, Mark Mai","doi":"10.1093/jamia/ocaf106","DOIUrl":"https://doi.org/10.1093/jamia/ocaf106","url":null,"abstract":"<p><strong>Objective: </strong>To assess the influence of an implemented artificial intelligence model predicting pediatric sepsis (defined by IPSO-Improving Pediatric Sepsis Outcomes collaborative) in the emergency department (ED) on human performance measures.</p><p><strong>Materials and methods: </strong>Two ED sites within a large pediatric health system in the Southeastern United States between January 1, 2021 and April 1, 2024. We interviewed ED providers and nurses within 72 hours of caring for a patient identified as potentially having sepsis by the predictive model. Thematic analysis of qualitative data was combined with electronic health record queries to assess measures of human performance, including situation awareness, explainability, human-computer agreement, workload, trust, automation bias, and relationship between staff and patients.</p><p><strong>Results: </strong>We interviewed 40 clinicians. Participants found that the sepsis alert improved situation awareness, leading to changes in patient care management, resource allocation, and/or monitoring. Participants reported an average trust in the model-based alert of 3.8/5. Only 28% (555/1977) of sepsis huddles were done without alert firing, suggesting some automation bias. Treatment with antibiotics for IPSO sepsis cases was similar pre- and post-intervention without a huddle (9.3% vs 10.5%), though treatment doubled with huddle intervention (22.7%). NASA Task Load Index increased from 43 to 57 post-intervention. 
There was no report of adverse relationships with patients post-intervention.</p><p><strong>Discussion: </strong>Human performance appeared to be generally positive with improved situation awareness and satisfaction with the alert-driven huddle. However, there was some evidence of automation bias and a slight increase in workload with the intervention.</p><p><strong>Conclusion: </strong>This study demonstrates the feasibility of evaluating multiple dimensions of human performance using a mixed methods approach for an AI model implemented in clinical practice. Future studies should aim to reduce the measurement burden of human performance metrics associated with AI implementation in acute care settings and assess the correlation between human performance measures and clinical outcomes.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144692230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early clinical evaluation of a vendor developed pediatric artificial intelligence sepsis model in the emergency department.","authors":"Swaminathan Kandaswamy, Evan W Orenstein, Naveen Muthu, Andrea McCarter, Nikolay Braykov, Jonathan M Beus, Edwin Ray, Tal Senior, Sara P Brown, Rebekah Carter, MaryBeth Gleeson, Hannah Thummel, John Cheng, Thuy Bui, Reena Blanco, Kiran Hebbar, James Fortenberry, Srikant B Iyer, Mark V Mai","doi":"10.1093/jamia/ocaf105","DOIUrl":"https://doi.org/10.1093/jamia/ocaf105","url":null,"abstract":"<p><strong>Objective: </strong>To conduct an independent external validation of an implemented vendor-developed emergency department (ED) pediatric sepsis predictive model.</p><p><strong>Materials and methods: </strong>We performed a retrospective cross-sectional study within 2 ED sites of a large pediatric health system between January 1, 2021 and April 1, 2024. A nurse-facing interruptive alert appeared when the model score exceeded the threshold, triggering clinicians to call a sepsis huddle. We compared model predictive performance with vendor-reported performance using definitions that accounted for model threshold and alert timing in clinical practice. Care processes and patient outcome measures included time to first antibiotics, time to first fluid bolus, 30-day mortality, ED to ICU admission rate, and ICU free days.</p><p><strong>Results: </strong>The pre-intervention cohort consisted of 268 102 ED visits with 741 (0.28%) sepsis cases. The post-intervention cohort consisted of 331 061 ED visits with 1114 (0.34%) sepsis cases. Model predictive performance dropped from vendor-reported performance. Mean time to first antibiotic decreased from 112 to 102 minutes (P = .05, 95% confidence interval of difference, -19.1 to 0.1) and time to first bolus decreased by 16.7 minutes (P = .03, 95% confidence interval difference, -31.8 to -1.5) after the intervention. 
Decreases in 30-day mortality (6% [45/741] to 4% [52/1114]), ED to ICU admissions (87% [646/741] to 84% [941/1114]), and ICU free days (6 to 5) after the intervention did not reach statistical significance.</p><p><strong>Discussion: </strong>Implementing the model led to significant reductions in time to fluid bolus and borderline decreases in time to antibiotics, with non-significant changes in mortality and ICU metrics. When implementing an externally developed model, local workflows, documentation patterns, and patient populations make it challenging to generalize published or reported model performance metrics to real-world performance.</p><p><strong>Conclusion: </strong>When tailoring a vendor-developed pediatric ED sepsis model for real-world usage, predictive performance differed substantially. Post-implementation, we found improvements in care process measures, suggesting such models may benefit sepsis care when adapted for specific clinical workflows.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144692228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of the impact of defining observable time in real-world data on outcome incidence.","authors":"Clair Blacketer, Frank J DeFalco, Mitchell M Conover, Patrick B Ryan, Martijn J Schuemie, Peter R Rijnbeek","doi":"10.1093/jamia/ocaf119","DOIUrl":"https://doi.org/10.1093/jamia/ocaf119","url":null,"abstract":"<p><strong>Objective: </strong>In real-world data (RWD), defining the observation period-the time during which a patient is considered observable-is critical for estimating incidence rates (IRs) and other outcomes. Yet, in the absence of explicit enrollment information, this period must often be inferred, introducing potential bias.</p><p><strong>Materials and methods: </strong>This study evaluates methods for defining observation periods and their impact on IR estimates across multiple database types. We applied 3 methods for defining observation periods: (1) a persistence + surveillance window approach, (2) an age- and gender-adjusted method based on time between healthcare events, and (3) the min/max method. These were tested across 11 RWD databases, including both enrollment-based and encounter-based sources. Enrollment time was used as the reference standard in eligible databases. To assess the impact on epidemiologic results, we replicated a prior study of adverse event incidence, comparing IRs and calculating mean squared error between methods.</p><p><strong>Results: </strong>Incidence rates decreased as observation periods lengthened, driven by increases in the person-time denominator. The persistence + surveillance method produced estimates closest to enrollment-based rates when appropriately balanced. The min/max approach yielded inconsistent results, particularly in encounter-based databases, with greater error observed in databases with longer time spans.</p><p><strong>Discussion: </strong>These findings suggest that assumptions about data completeness and population observability significantly affect incidence estimates. 
Observation period definitions substantially influence outcome measurement in RWD studies.</p><p><strong>Conclusion: </strong>Standardized, transparent approaches are necessary to ensure valid, reproducible results-especially in databases lacking defined enrollment.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144692229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
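The core sensitivity this abstract describes, that a longer inferred observation period inflates the person-time denominator and deflates the incidence rate, can be shown directly. The encounter dates and the persistence window below are hypothetical, chosen only to contrast the min/max method with a persistence + surveillance window.

```python
from datetime import date

def person_years(start, end):
    return (end - start).days / 365.25

def incidence_rate(events, periods):
    # Events per 1000 person-years over the chosen observation periods.
    total_py = sum(person_years(s, e) for s, e in periods)
    return 1000 * events / total_py

# One patient, two ways of inferring observability from encounter dates.
encounters = [date(2020, 1, 1), date(2020, 3, 1), date(2023, 1, 1)]
minmax = [(min(encounters), max(encounters))]         # min/max method: first to last event
persistence = [(date(2020, 1, 1), date(2020, 9, 1))]  # hypothetical persistence + surveillance window

ir_minmax = incidence_rate(1, minmax)
ir_persist = incidence_rate(1, persistence)  # shorter denominator -> higher rate
```

The same single event yields a much higher rate under the shorter window, which is why the choice of definition matters most in encounter-based databases with long, sparse histories.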
{"title":"Comparing patient-reported symptoms and structured clinician documentation in electronic health records.","authors":"Victor M Castro, Vivian S Gainer, Danielle M Crookes, Shawn N Murphy, Justin Manjourides","doi":"10.1093/jamia/ocaf112","DOIUrl":"https://doi.org/10.1093/jamia/ocaf112","url":null,"abstract":"<p><strong>Objectives: </strong>Real-world data (RWD) analyses primarily rely on structured clinical documentation collected through routine clinical care or driven by medical billing requirements. Patient-reported outcome measures (PROMs), integrated into electronic health records (EHRs), are an additional data source that could offer valuable insights into a patient's perspective and contribute to a more comprehensive understanding of health outcomes in RWD studies. This study aims to characterize agreement between PROM symptoms and structured clinical documentation of these symptoms by clinicians in EHRs.</p><p><strong>Materials and methods: </strong>A cross-sectional study of 913 244 adult primary care annual physical visits between January 1, 2019 and December 31, 2023. We compared differences in prevalence and agreement of patient-reported symptoms (PRS) and structured clinician documentation (CD) across 15 respiratory, gastrointestinal, cardiometabolic, and neuropsychiatric symptoms.</p><p><strong>Results: </strong>Patient-reported symptom prevalence was significantly higher compared to CD across most symptoms including joint pain (33% PRS vs 12% CD), headaches (17% PRS vs 8.8% CD), and sleep disturbance (24% PRS vs 10% CD). Clinicians documented anxiety (11% PRS vs 23% CD) and depression (6.6% PRS vs 15.4% CD) symptoms using structured codes at higher rates than patients reported them.
Agreement between symptom self-report and clinician-documented structured codes was low to moderate (κ: 0.06-0.39).</p><p><strong>Discussion: </strong>Primary care patients self-report symptoms up to ten times more frequently than clinicians document them with structured codes in the EHR.</p><p><strong>Conclusion: </strong>This work demonstrates the value and feasibility of incorporating PRSs in RWD studies to reduce misclassification and more holistically capture a patient's health.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144664110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
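The κ statistics reported here are Cohen's kappa. A minimal sketch on toy binary data shows how strong marginal imbalance (patients reporting a symptom far more often than clinicians code it) produces low chance-corrected agreement even when raw agreement looks moderate.

```python
def cohens_kappa(a, b):
    # Chance-corrected agreement between two binary raters,
    # e.g. patient self-report vs clinician structured code.
    n = len(a)
    po = sum(1 for x, y in zip(a, b) if x == y) / n       # observed agreement
    p_yes = (sum(a) / n) * (sum(b) / n)                   # chance both say yes
    p_no = (1 - sum(a) / n) * (1 - sum(b) / n)            # chance both say no
    pe = p_yes + p_no                                     # expected agreement
    return (po - pe) / (1 - pe)

# Toy data: the patient reports the symptom in 4 of 8 visits,
# the clinician codes it in only 1.
patient_reported = [1, 1, 1, 1, 0, 0, 0, 0]
clinician_coded  = [1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(patient_reported, clinician_coded)
```

Here raw agreement is 5/8 = 0.625 yet κ is only 0.25, inside the low-to-moderate range the study reports.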
{"title":"The influence of recommendation algorithms on users' intention to adopt health information: does trust belief play a role?","authors":"Yaling Luo, Zerui Zhao, Xiaojuan Xu, Yueyan Zhao, Feng Yang","doi":"10.1093/jamia/ocaf115","DOIUrl":"https://doi.org/10.1093/jamia/ocaf115","url":null,"abstract":"<p><strong>Objectives: </strong>Recommendation systems have emerged as prevalent and effective tools. Investigating the impact of recommendation algorithms on users' health information adoption behavior can aid in optimizing health information services and advancing the construction and development of online health community platforms.</p><p><strong>Materials and methods: </strong>This study designed scenario experiments for social- and profile-oriented recommendations and collected data accordingly. The Theory of Knowledge-Based Trust was applied to explain users' trust beliefs in algorithmic recommendations. Nonparametric tests, logistic regression, and bootstrapping were used to test the variables' main, mediating, and moderating effects.</p><p><strong>Results: </strong>Social-oriented and profile-oriented recommendations were significantly correlated with users' intentions to adopt information. Competence belief (CB), benevolence belief (BB), and integrity belief (IB) mediated this relationship. Overall, the moderating effect of privacy concerns (PCs) is significant.</p><p><strong>Discussion: </strong>Both social- and profile-oriented recommendations can enhance users' willingness to adopt health information by facilitating their knowledge-based trust, with integrity beliefs playing a more substantial mediating role. 
Privacy concerns negatively moderate the effect of profile-oriented recommendations on information adoption intention via benevolence and competence beliefs.</p><p><strong>Conclusions: </strong>This study enriches the theoretical foundation of user health information adoption behavior in algorithmic recommendation contexts and provides new insights into the practice of health information services on social media platforms.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144664071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated analyses of risk of bias and critical appraisal of systematic reviews (ROBIS and AMSTAR 2): a comparison of the performance of 4 large language models.","authors":"Diego A Forero, Sandra E Abreu, Blanca E Tovar, Marilyn H Oermann","doi":"10.1093/jamia/ocaf117","DOIUrl":"https://doi.org/10.1093/jamia/ocaf117","url":null,"abstract":"<p><strong>Objectives: </strong>To explore the performance of 4 large language model (LLM) chatbots in applying 2 of the most commonly used tools for the critical appraisal of systematic reviews (SRs) and meta-analyses.</p><p><strong>Materials and methods: </strong>We compared the performance of 4 LLM chatbots (ChatGPT, Gemini, DeepSeek, and QWEN) in applying the ROBIS and AMSTAR 2 tools (sample sizes: 20 SRs) against assessments by human experts.</p><p><strong>Results: </strong>Gemini showed the best agreement with human experts for both ROBIS and AMSTAR 2 (accuracy: 58% and 70%, respectively). The second-best LLM chatbots were ChatGPT and QWEN, for ROBIS and AMSTAR 2, respectively.</p><p><strong>Discussion: </strong>Some LLM chatbots underestimated the risk of bias or overestimated the confidence of the results in published SRs, consistent with recent reports on other tools.</p><p><strong>Conclusion: </strong>This is one of the first studies comparing the performance of several LLM chatbots for the automated analyses of ROBIS and AMSTAR 2.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144664108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Breaking data silos: incorporating the DICOM imaging standard into the OMOP CDM to enable multimodal research.","authors":"Woo Yeon Park, Teri Sippel Schmidt, Gabriel Salvador, Kevin O'Donnell, Brad Genereaux, Kyulee Jeon, Seng Chan You, Blake E Dewey, Paul Nagy","doi":"10.1093/jamia/ocaf091","DOIUrl":"https://doi.org/10.1093/jamia/ocaf091","url":null,"abstract":"<p><strong>Objective: </strong>This work incorporates the Digital Imaging and Communications in Medicine (DICOM) Standard into the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) to standardize and accurately represent imaging studies, including acquisition parameters, in multimodal research studies.</p><p><strong>Materials and methods: </strong>DICOM is the internationally adopted standard that defines entities and relationships for biomedical imaging data used for clinical imaging studies. Most of the complexity in the DICOM data structure centers around the metadata. This metadata contains information about the patient and the modality acquisition parameters. We parsed the DICOM vocabularies in Parts 3, 6, and 16 to obtain structured metadata definitions and added these as custom concepts in the OMOP CDM vocabulary. To validate our pipeline, we harvested and transformed DICOM metadata from magnetic resonance images in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.</p><p><strong>Results: </strong>We extracted and added 5183 attributes and 3628 coded values from the DICOM standard as custom concepts to the OMOP CDM vocabulary. We ingested 545 ADNI imaging studies containing 4756 series and harvested 691 224 metadata values. They were filtered, transformed, and loaded in the OMOP CDM imaging extension using the OMOP concepts for the DICOM attributes and values.</p><p><strong>Discussion: </strong>This work is adaptable to clinical DICOM data.
Future work will validate scalability and incorporate outcomes from automated analysis to provide a complete characterization of research studies within the OMOP framework.</p><p><strong>Conclusion: </strong>The difficulty of incorporating medical imaging into clinical observational studies has been a barrier to multimodal research. This work demonstrates detailed phenotyping and paves the way for observational multimodal research.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144664109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
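The mapping this abstract describes, DICOM attributes becoming custom OMOP concepts keyed by concept_id, can be sketched with a toy lookup. The concept_ids below are hypothetical (OMOP convention reserves the 2-billion-plus range for local custom concepts), only three real DICOM tags are handled, and the row shape is a simplification of the imaging extension tables.

```python
# Hypothetical fragment of a custom-concept lookup: DICOM attribute
# (tag, keyword) -> locally assigned OMOP concept_id.
DICOM_CONCEPTS = {
    ("0018,0080", "RepetitionTime"):        2_100_000_001,
    ("0018,0081", "EchoTime"):              2_100_000_002,
    ("0018,0087", "MagneticFieldStrength"): 2_100_000_003,
}

def to_measurement_rows(person_id, image_occurrence_id, metadata):
    # Turn harvested DICOM metadata into OMOP-style measurement rows,
    # keeping only attributes that have a mapped custom concept.
    rows = []
    for (tag, keyword), concept_id in DICOM_CONCEPTS.items():
        if keyword in metadata:
            rows.append({
                "person_id": person_id,
                "image_occurrence_id": image_occurrence_id,
                "measurement_concept_id": concept_id,
                "value_as_number": metadata[keyword],
            })
    return rows

rows = to_measurement_rows(1, 42, {"RepetitionTime": 2300.0, "EchoTime": 2.98})
```

A real pipeline would read these keywords from the parsed Parts 3/6/16 vocabularies rather than a hand-written dict, but the join from harvested metadata to concept-coded rows is the same shape.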