{"title":"The influence of recommendation algorithms on users' intention to adopt health information: does trust belief play a role?","authors":"Yaling Luo, Zerui Zhao, Xiaojuan Xu, Yueyan Zhao, Feng Yang","doi":"10.1093/jamia/ocaf115","DOIUrl":"10.1093/jamia/ocaf115","url":null,"abstract":"<p><strong>Objectives: </strong>Recommendation systems have emerged as prevalent and effective tools. Investigating the impact of recommendation algorithms on users' health information adoption behavior can aid in optimizing health information services and advancing the construction and development of online health community platforms.</p><p><strong>Materials and methods: </strong>This study designed scenario experiments for social- and profile-oriented recommendations and collected data accordingly. The Theory of Knowledge-Based Trust was applied to explain users' trust beliefs in algorithmic recommendations. Nonparametric tests, logistic regression, and bootstrapping were used to test the variables' main, mediating, and moderating effects.</p><p><strong>Results: </strong>Social-oriented and profile-oriented recommendations were significantly correlated with users' intentions to adopt information. Competence belief (CB), benevolence belief (BB), and integrity belief (IB) mediated this relationship. Overall, the moderating effect of privacy concerns (PCs) is significant.</p><p><strong>Discussion: </strong>Both social- and profile-oriented recommendations can enhance users' willingness to adopt health information by facilitating their knowledge-based trust, with integrity beliefs playing a more substantial mediating role. 
Privacy concerns negatively moderate the effect of profile-oriented recommendations, through benevolence and competence beliefs, on information adoption intention.</p><p><strong>Conclusions: </strong>This study enriches the theoretical foundation of user health information adoption behavior in algorithmic recommendation contexts and provides new insights into the practice of health information on social media platforms.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1415-1424"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361861/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144664071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting clinical reasoning through visual summarization and presentation of patient data: a systematic review.","authors":"Hao Fan, Angela Hardi, Po-Yin Yen","doi":"10.1093/jamia/ocaf103","DOIUrl":"10.1093/jamia/ocaf103","url":null,"abstract":"<p><strong>Objectives: </strong>Clinicians retrieve data from electronic health record (EHR) systems and summarize them into clinical information to accomplish clinical reasoning and decision-making tasks. Visualization, using meaningful summarization methods and intuitive presentation approaches, can enhance this process. This systematic review examines how EHR data are summarized, visualized, and aligned with the 7 clinical reasoning and decision-making tasks shared by clinicians.</p><p><strong>Materials and methods: </strong>We searched 7 databases for research articles on individual patient EHR related to visualization, clinical decision-support, and patient summaries. Evidence from included studies was extracted for EHR data types, information summarization methods, visualization strategies, clinician characteristics, and evaluations. The synthesized evidence generated data-information-visualization (data-info-vis) flows.</p><p><strong>Results: </strong>We included 112 studies of which 70 (62.5%) conducted detailed usability evaluations, while 42 (37.5%) did not report any evaluations. Gaps remain in deriving actionable insights from EHR data, particularly for tasks requiring data quality reports. Three representative data-info-vis flows emerge. The first uses structured data to generate patterns for temporal visualizations, supporting tasks such as diagnosis and patient management. The second abstracts data into miniature charts, aiding situation-aware understanding and knowledge synthesis. 
The third features high-level visual metaphors for complex and overarching tasks, such as achieving better care.</p><p><strong>Discussion and conclusion: </strong>This review identifies 2 primary visualization strategies: (1) timeline-based presentations emphasizing temporal trends and longitudinal tracking, and (2) snapshot-based approaches focusing on status overviews and rapid assessments. The identified critical design approaches and distinct data-info-vis flows are tailored to clinical reasoning and decision-making tasks, offering insights for developing visualization-based decision-support tools.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1485-1498"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361860/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144530766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated analyses of risk of bias and critical appraisal of systematic reviews (ROBIS and AMSTAR 2): a comparison of the performance of 4 large language models.","authors":"Diego A Forero, Sandra E Abreu, Blanca E Tovar, Marilyn H Oermann","doi":"10.1093/jamia/ocaf117","DOIUrl":"10.1093/jamia/ocaf117","url":null,"abstract":"<p><strong>Objectives: </strong>To explore the performance of 4 large language model (LLM) chatbots for the analysis of 2 of the most commonly used tools for the advanced analysis of systematic reviews (SRs) and meta-analyses.</p><p><strong>Materials and methods: </strong>We explored the performance of 4 LLM chatbots (ChatGPT, Gemini, DeepSeek, and QWEN) for the analysis of ROBIS and AMSTAR 2 tools (sample sizes: 20 SRs), in comparison with assessments by human experts.</p><p><strong>Results: </strong>Gemini showed the best agreement with human experts for both ROBIS and AMSTAR 2 (accuracy: 58% and 70%). The second best LLM chatbots were ChatGPT and QWEN, for ROBIS and AMSTAR 2, respectively.</p><p><strong>Discussion: </strong>Some LLM chatbots underestimated the risk of bias or overestimated the confidence of the results in published SRs, which is compatible with recent articles for other tools.</p><p><strong>Conclusion: </strong>This is one of the first studies comparing the performance of several LLM chatbots for the automated analyses of ROBIS and AMSTAR 2.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1471-1476"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361857/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144664108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of the impact of defining observable time in real-world data on outcome incidence.","authors":"Clair Blacketer, Frank J DeFalco, Mitchell M Conover, Patrick B Ryan, Martijn J Schuemie, Peter R Rijnbeek","doi":"10.1093/jamia/ocaf119","DOIUrl":"10.1093/jamia/ocaf119","url":null,"abstract":"<p><strong>Objective: </strong>In real-world data (RWD), defining the observation period (the time during which a patient is considered observable) is critical for estimating incidence rates (IRs) and other outcomes. Yet, in the absence of explicit enrollment information, this period must often be inferred, introducing potential bias.</p><p><strong>Materials and methods: </strong>This study evaluates methods for defining observation periods and their impact on IR estimates across multiple database types. We applied 3 methods for defining observation periods: (1) a persistence + surveillance window approach, (2) an age- and gender-adjusted method based on time between healthcare events, and (3) the min/max method. These were tested across 11 RWD databases, including both enrollment-based and encounter-based sources. Enrollment time was used as the reference standard in eligible databases. To assess the impact on epidemiologic results, we replicated a prior study of adverse event incidence, comparing IRs and calculating mean squared error between methods.</p><p><strong>Results: </strong>Incidence rates decreased as observation periods lengthened, driven by increases in the person-time denominator. The persistence + surveillance method produced estimates closest to enrollment-based rates when appropriately balanced. The min/max approach yielded inconsistent results, particularly in encounter-based databases, with greater error observed in databases with longer time spans.</p><p><strong>Discussion: </strong>These findings suggest that assumptions about data completeness and population observability significantly affect incidence estimates. 
Observation period definitions substantially influence outcome measurement in RWD studies.</p><p><strong>Conclusion: </strong>Standardized, transparent approaches are necessary to ensure valid, reproducible results, especially in databases lacking defined enrollment.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1434-1444"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361855/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144692229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
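The contrast between the min/max and persistence + surveillance definitions above can be illustrated in a few lines of code. This is a minimal sketch under assumed inputs, not the authors' implementation: the function names, the 180-day persistence window, and the visit/event dates are invented for illustration.

```python
def min_max_period(visits):
    """Observable from the first to the last recorded healthcare event (min/max method)."""
    return (min(visits), max(visits))

def persistence_period(visits, persistence_days=180):
    """Extend observability `persistence_days` past each visit; end observation
    at the first gap longer than the window (keep the first continuous run)."""
    visits = sorted(visits)
    end = visits[0] + persistence_days
    for v in visits[1:]:
        if v > end:  # gap exceeds the persistence window: observation ends here
            break
        end = v + persistence_days
    return (visits[0], end)

def incidence_rate(events, period, per=1000.0):
    """Events inside the period divided by person-time (here, person-days)."""
    start, end = period
    n = sum(start <= e <= end for e in events)
    return per * n / (end - start)

visits = [0, 30, 90, 600, 650]  # days since an index date; long gap after day 90
events = [45, 620]              # outcome occurrences

ir_minmax = incidence_rate(events, min_max_period(visits))       # denominator: 650 days
ir_persist = incidence_rate(events, persistence_period(visits))  # denominator: 270 days
```

With a long gap in the visit history, the min/max method keeps the entire span in the person-time denominator, so its rate comes out lower — the same direction as the reported finding that IRs decrease as observation periods lengthen.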
{"title":"In defense of empathic informatics.","authors":"Harry Hochheiser, Shyam Visweswaran","doi":"10.1093/jamia/ocaf107","DOIUrl":"10.1093/jamia/ocaf107","url":null,"abstract":"<p><strong>Objectives: </strong>To explore the potential effects of recent restrictions on discussions regarding diversity, equity, and inclusion (DEI) in the field of biomedical informatics.</p><p><strong>Materials and methods: </strong>Executive orders issued by the U.S. federal government regarding diversity and gender issues are discussed in the context of implications for biomedical informatics research.</p><p><strong>Results: </strong>Restrictions on specific terminology can hinder research into critical topics such as bias and fairness in clinical artificial intelligence and machine learning algorithms. Additionally, these limitations may narrow the scope of questions that informatics research can address and obstruct efforts to enhance the diversity of perspectives within the field.</p><p><strong>Discussion: </strong>Responding to these threats requires a community response. 
The American Medical Informatics Association (AMIA) can help the informatics community present a united front in support of DEI research in multiple ways.</p><p><strong>Conclusion: </strong>The informatics community should mount a strong and unambiguous response in support of diversity, equity, and inclusion of underrepresented perspectives in the field.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1499-1502"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361854/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144612432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnostic accuracy differences in detecting wound maceration between humans and artificial intelligence: the role of human expertise revisited.","authors":"Florian Kücking, Ursula H Hübner, Dorothee Busch","doi":"10.1093/jamia/ocaf116","DOIUrl":"10.1093/jamia/ocaf116","url":null,"abstract":"<p><strong>Objective: </strong>This study aims to compare the diagnostic abilities of humans in wound image assessment with those of an AI-based model, examine how \"expertise\" affects clinicians' diagnostic performance, and investigate the heterogeneity in clinical judgments.</p><p><strong>Materials and methods: </strong>A total of 481 healthcare professionals completed a diagnostic task involving 30 chronic wound images with and without maceration. A convolutional neural network (CNN) classification model performed the same task. To predict human accuracy, participants' \"expertise,\" ie, pertinent formal qualification, work experience, self-confidence, and wound focus, was analyzed in a regression analysis. Human interrater reliability was calculated.</p><p><strong>Results: </strong>Human participants achieved an average accuracy of 79.3% and a maximum accuracy of 85% in the formally qualified group. Achieving 90% accuracy, the CNN performed better but not significantly. Pertinent formal qualification (β = 0.083, P < .001) and diagnostic self-confidence (β = 0.015, P = .002) significantly predicted human accuracy, while work experience and focus on wound care had no effect (R2 = 24.3%). Overall interrater reliability was \"fair\" (Kappa = 0.391).</p><p><strong>Discussion: </strong>Among the \"expertise\"-related factors, only the qualification and self-confidence variables influenced diagnostic accuracy. 
These findings challenge previous assumptions about work experience or job titles defining \"expertise\" and influencing human diagnostic performance.</p><p><strong>Conclusion: </strong>This study offers guidance to future studies when comparing human expert and AI task performance. However, to explain human diagnostic accuracy, \"expertise\" may only serve as one correlate, while additional factors need further research.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1425-1433"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361858/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144651087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
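The "fair" interrater reliability quoted in the wound-maceration record above is a chance-corrected agreement statistic. For the two-rater case, Cohen's kappa can be sketched as below; the ratings are made-up illustration data, and the study's multi-rater statistic may differ (eg, Fleiss' kappa).

```python
def cohens_kappa(rater1, rater2):
    """Two-rater Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater1)
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    categories = set(rater1) | set(rater2)
    # Chance agreement: product of each rater's marginal rate, summed over categories.
    p_chance = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# 1 = maceration present, 0 = absent; raters disagree on one of four images.
example = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Note how 75% raw agreement shrinks once chance agreement is removed, which is why a kappa of 0.391 is labeled only "fair" despite ~79% human accuracy.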
{"title":"Differences in physician electronic health record use by telemedicine intensity: evidence from 2 academic medical centers.","authors":"Seunghwan Kim, Robert Thombley, Elise Eiden, Sunny Lou, Julia Adler-Milstein, Thomas Kannampallil, A Jay Holmgren","doi":"10.1093/jamia/ocaf122","DOIUrl":"10.1093/jamia/ocaf122","url":null,"abstract":"<p><strong>Objective: </strong>Evaluate the association between telemedicine intensity and ambulatory physician electronic health record (EHR) use following the COVID-19 pandemic.</p><p><strong>Materials and methods: </strong>This retrospective study included ambulatory physicians in 11 specialties at 2 large academic medical centers (Washington University in St Louis [WashU], University of California San Francisco [UCSF]). EHR use measures, both time-based and frequency-based, were analyzed in the post-COVID-19 period (March 1, 2021, through March 7, 2022). Multivariable regression models with 2-way fixed effects were used to assess the association between telemedicine intensity and EHR use.</p><p><strong>Results: </strong>Fully telemedicine physician-weeks were associated with higher EHR time (hours per 8 scheduled patient hours; β = 3.2 at WashU, β = 1.4 at UCSF; P < .001) and documentation time (β = 2.7 at WashU, β = 1.4 at UCSF; P < .001). Several differences in discrete EHR-based tasks were observed: fully telemedicine physician-days were associated with less ordering, and there were mixed patterns for information seeking and clinical communication tasks.</p><p><strong>Discussion: </strong>Expanded use of telemedicine was associated with significant changes in physician EHR use post-COVID-19 onset. Increased EHR time may suggest a shift in workload, whereas decreased ordering may suggest constraints in virtual care, such as the limited ability to perform physical examinations and the reliance on patient-reported symptoms. 
Institutional differences in usage patterns suggest that telemedicine's impact is context-specific and provide opportunities for understanding how to optimize EHRs to support telemedicine.</p><p><strong>Conclusion: </strong>Telemedicine shifts physician EHR use. Supporting physicians through optimized EHR tools, tailored workflows, and team-based interventions is essential for sustainable virtual care delivery without exacerbating EHR burden.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1462-1470"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361853/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144620961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explaining alerts from a pediatric risk prediction model using clinical text.","authors":"Samuel Nycklemoe, Sriharsha Devarapu, Yanjun Gao, Kyle Carey, Nicholas Kuehnel, Neil Munjal, Priti Jani, Matthew Churpek, Dmitriy Dligach, Majid Afshar, Anoop Mayampurath","doi":"10.1093/jamia/ocaf121","DOIUrl":"10.1093/jamia/ocaf121","url":null,"abstract":"<p><strong>Objective: </strong>Risk prediction models are used in hospitals to identify pediatric patients at risk of clinical deterioration, enabling timely interventions and rescue. The objective of this study was to develop a new explainer algorithm that uses a patient's clinical notes to generate text-based explanations for risk prediction alerts.</p><p><strong>Materials and methods: </strong>We conducted a retrospective study of 39 406 patient admissions to the American Family Children's Hospital at the University of Wisconsin-Madison (2009-2020). The validated pediatric Calculated Assessment of Risk and Triage (pCART) risk prediction model was used to identify children at risk for deterioration. A transformer model was trained to use clinical notes from the 12-hour period preceding each pCART score to predict whether a patient was flagged as at risk. Then, label-aware attention highlighted text phrases most important to an at-risk alert. The study cohort was randomly split into derivation (60%) and validation (20%) data, and a separate test set (20%) was used to evaluate the explainer's performance.</p><p><strong>Results: </strong>Our pCART Explainer algorithm performed well in discriminating at-risk pCART alert vs no alert (c-statistic 0.805). 
Sample explanations from pCART Explainer revealed clinically important phrases such as \"rapid breathing,\" \"fall risk,\" \"distension,\" and \"grunting,\" thereby demonstrating excellent face validity.</p><p><strong>Discussion: </strong>The pCART Explainer could quickly orient clinicians to the patient's condition by drawing attention to key phrases in notes, potentially enhancing situational awareness and guiding decision-making.</p><p><strong>Conclusion: </strong>We developed pCART Explainer, a novel algorithm that highlights text within clinical notes to provide medically relevant context about deterioration alerts, thereby improving the explainability of the pCART model.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1445-1453"},"PeriodicalIF":4.6,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12361859/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of postoperative infections by strategic data imputation and explainable machine learning.","authors":"Hugo Guillen-Ramirez, Daniel Sanchez-Taltavull, Stéphanie Perrodin, Sarah Peisl, Karen Triep, Christophe Gaudet-Blavignac, Olga Endrich, Guido Beldi","doi":"10.1093/jamia/ocaf145","DOIUrl":"https://doi.org/10.1093/jamia/ocaf145","url":null,"abstract":"<p><strong>Objectives: </strong>Infections following healthcare-associated interventions drive patient morbidity and mortality, making early detection essential. Traditional predictive models utilize preoperative surgical characteristics. This study evaluated whether integrating postoperative laboratory values and their kinetics could improve outcome prediction.</p><p><strong>Materials and methods: </strong>A total of 91 794 surgical cases were extracted from electronic health records (EHR) and analyzed to predict bacterial infection as the endpoint. The endpoint was documented in the EHR as ICD-10 codes by a professional coding team. Variables were grouped as preoperative, intraoperative, or postoperative. Strategic imputation was used for postoperative missing laboratory values. Procedure-agnostic prediction models were built incorporating both static and kinetic properties of laboratory values.</p><p><strong>Results: </strong>The integration of kinetics of laboratory values into a machine learning predictor achieved a recall, precision, and ROC AUC at postoperative day 2 of 0.71, 0.69, and 0.83, respectively. Moreover, infection detection outperformed clinician-based decision-making, as reflected by the postoperative timing of antibiotic administration. The analysis identified previously unknown, informative combinations of routine markers from hepatic, renal, and bone marrow functions that predict the outcome.</p><p><strong>Discussion: </strong>Dynamic modelling of postoperative laboratory values enhanced the timeliness and accuracy of infection detection compared with static or preoperative-only models. 
The integration of explainable machine learning supports clinical interpretation and highlights the contribution of multiple organ systems to postoperative infection risk.</p><p><strong>Conclusion: </strong>We present a surgery-independent workflow that integrates time-series laboratory values to enhance baseline predictors of infection. This interpretable approach is generalizable across procedures and has the potential to optimize patient outcomes and resource use in surgical care.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.6,"publicationDate":"2025-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144976084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
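One plausible reading of "static and kinetic properties of laboratory values" in the record above is features such as the latest value, the change from baseline, and the per-day trend. The sketch below is a hypothetical feature extractor under that assumption; the feature names and the least-squares slope are illustrative choices, not the paper's definitions.

```python
def lab_features(times_days, values):
    """Static + kinetic features from one postoperative lab series:
    last value, delta from the baseline (first) value, and ordinary
    least-squares slope of value against time (units per day)."""
    n = len(values)
    mean_t = sum(times_days) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_days, values))
    den = sum((t - mean_t) ** 2 for t in times_days)
    return {
        "last": values[-1],
        "delta_from_baseline": values[-1] - values[0],
        "slope_per_day": num / den,
    }

# Eg, a C-reactive-protein-like marker rising over postoperative days 0-2.
features = lab_features([0, 1, 2], [100, 120, 140])
```

A rising slope captures kinetics that a single static value cannot, which is the intuition behind combining both feature types in a downstream classifier.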
{"title":"Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations.","authors":"Rui Yang, Jiayi Tong, Haoyuan Wang, Hui Huang, Ziyang Hu, Peiyu Li, Nan Liu, Christopher J Lindsell, Michael J Pencina, Yong Chen, Chuan Hong","doi":"10.1093/jamia/ocaf137","DOIUrl":"https://doi.org/10.1093/jamia/ocaf137","url":null,"abstract":"<p><strong>Objectives: </strong>Systematic reviews in comparative effectiveness research require timely evidence synthesis. With the rapid advancement of medical research, preprint articles play an increasingly important role in accelerating knowledge dissemination. However, as preprint articles are not peer-reviewed before publication, their quality varies significantly, posing challenges for evidence inclusion in systematic reviews.</p><p><strong>Materials and methods: </strong>We developed AutoConfidenceScore (automated confidence score assessment), an advanced framework for predicting preprint publication, which reduces reliance on manual curation and expands the range of predictors, including three key advancements: (1) automated data extraction using natural language processing techniques, (2) semantic embeddings of titles and abstracts, and (3) large language model (LLM)-driven evaluation scores. Additionally, we employed two prediction models: a random forest classifier for binary outcome and a survival cure model that predicts both binary outcome and publication risk over time.</p><p><strong>Results: </strong>The random forest classifier achieved an area under the receiver operating characteristic curve (AUROC) of 0.747 using all features. The survival cure model achieved an AUROC of 0.731 for binary outcome prediction and a concordance index of 0.667 for time-to-publication risk.</p><p><strong>Discussion: </strong>Our study advances the framework for preprint publication prediction through automated data extraction and multiple feature integration. 
By combining semantic embeddings with LLM-driven evaluations, AutoConfidenceScore significantly enhances predictive performance while reducing manual annotation burden.</p><p><strong>Conclusion: </strong>AutoConfidenceScore has the potential to facilitate incorporation of preprint articles during the appraisal phase of systematic reviews, supporting researchers in more effective utilization of preprint resources.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.6,"publicationDate":"2025-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144975882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
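The AUROC values reported for AutoConfidenceScore have a simple rank interpretation: the probability that a randomly chosen published preprint receives a higher score than a randomly chosen unpublished one. A minimal sketch of that Mann-Whitney form, with invented scores and labels:

```python
def auroc(scores, labels):
    """Mann-Whitney form of AUROC: fraction of positive-negative pairs in
    which the positive scores higher (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# 1 = preprint later published, 0 = not; scores are hypothetical model outputs.
example = auroc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

On this scale, the reported 0.747 and 0.731 mean the model ranks a published preprint above an unpublished one roughly three times out of four.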