JAMIA Open | Pub Date: 2025-05-26 | eCollection Date: 2025-06-01 | DOI: 10.1093/jamiaopen/ooaf037
Lili M Schöler, Lisa Graf, Antti Airola, Alexander Ritzi, Michael Simon, Laura-Maria Peltonen
{"title":"Determining the ground truth for the prediction of delirium in adult patients in acute care: a scoping review.","authors":"Lili M Schöler, Lisa Graf, Antti Airola, Alexander Ritzi, Michael Simon, Laura-Maria Peltonen","doi":"10.1093/jamiaopen/ooaf037","DOIUrl":"10.1093/jamiaopen/ooaf037","url":null,"abstract":"<p><strong>Objective: </strong>Delirium is a severe condition, often underreported and linked to adverse outcomes such as increased mortality and prolonged hospitalization. Despite its significance, delirium prediction is often hindered by underreporting and inconsistent labeling, highlighting the need for models trained on reliably labeled data (ground truth). This review examines (i) practices for determining labels in delirium prediction models and (ii) how study designs affect label quality, aiming to identify key considerations for improving model reliability.</p><p><strong>Materials and methods: </strong>A search of Cochrane, PubMed, and IEEE identified 120 studies that met the inclusion criteria.</p><p><strong>Results: </strong>To establish the ground truth, 40.8% of studies used routine data, while 42.5% used primary data. The Confusion Assessment Method (CAM) was the most widely used assessment tool (60.0%). Label and data leakage occurred in 35.0% of studies. High Risk of Bias (RoB) was a recurring issue, with 31.7% of studies lacking sufficient reporting and 36.7% showing inadequate outcome determination. Studies using primary data had lower RoB, whereas those with unclear label sources displayed higher RoB.</p><p><strong>Discussion: </strong>Our findings underscore the importance of careful planning in determining the ground truth, a step frequently neglected in existing studies. 
To address these challenges, we provide a decision support flowchart to guide the development of more accurate and reliable prediction models.</p><p><strong>Conclusion: </strong>This review uncovers significant variability in labeling methods and discusses how this may affect delirium prediction model reliability, highlighting the importance of addressing underreporting bias and providing guidance for developing more robust models.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 3","pages":"ooaf037"},"PeriodicalIF":2.5,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12105575/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144152154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-04-26 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf032
Junghwan Lee, Simin Ma, Nicoleta Serban, Shihao Yang
{"title":"Accurate treatment effect estimation using inverse probability of treatment weighting with deep learning.","authors":"Junghwan Lee, Simin Ma, Nicoleta Serban, Shihao Yang","doi":"10.1093/jamiaopen/ooaf032","DOIUrl":"https://doi.org/10.1093/jamiaopen/ooaf032","url":null,"abstract":"<p><strong>Objectives: </strong>Observational data have been actively used to estimate treatment effect, driven by the growing availability of electronic health records (EHRs). However, EHRs typically consist of longitudinal records, often introducing time-dependent confounding that hinders the unbiased estimation of treatment effect. Inverse probability of treatment weighting (IPTW) is a widely used propensity score method since it provides unbiased treatment effect estimation and its derivation is straightforward. In this study, we aim to utilize IPTW to estimate treatment effect in the presence of time-dependent confounding using claims records.</p><p><strong>Materials and methods: </strong>Previous studies have utilized propensity score methods with features derived from claims records through feature processing, which generally requires domain knowledge and additional resources to extract information to accurately estimate propensity scores. Deep learning, particularly using deep sequence models such as recurrent neural networks and Transformer, has demonstrated good performance in modeling EHRs for various downstream tasks. 
We propose that these deep sequence models can provide accurate IPTW estimation of treatment effect by directly estimating the propensity scores from claims records without the need for feature processing.</p><p><strong>Results: </strong>Comprehensive evaluations on synthetic and semi-synthetic datasets demonstrate that IPTW treatment effect estimation using deep sequence models consistently outperforms baseline approaches, including logistic regression and multilayer perceptrons, combined with feature processing.</p><p><strong>Discussion: </strong>Our findings demonstrate that deep sequence models consistently outperform traditional approaches in estimating treatment effects, particularly under time-dependent confounding. Moreover, Transformer-based models offer interpretability by assigning higher attention weights to relevant confounders, even when prior domain knowledge is limited.</p><p><strong>Conclusion: </strong>Deep sequence models enable accurate treatment effect estimation through IPTW without the need for feature processing.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf032"},"PeriodicalIF":2.5,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033031/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144042421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-04-23 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf029
Tyler G James, Courtney W Mangus, Sarah J Parker, P Paul Chandanabhumma, C M Cassady, Fernanda Bellolio, Kalyan Pasupathy, Milisa Manojlovich, Hardeep Singh, Prashant Mahajan
{"title":"\"Everything is electronic health record-driven\": the role of the electronic health record in the emergency department diagnostic process.","authors":"Tyler G James, Courtney W Mangus, Sarah J Parker, P Paul Chandanabhumma, C M Cassady, Fernanda Bellolio, Kalyan Pasupathy, Milisa Manojlovich, Hardeep Singh, Prashant Mahajan","doi":"10.1093/jamiaopen/ooaf029","DOIUrl":"https://doi.org/10.1093/jamiaopen/ooaf029","url":null,"abstract":"<p><strong>Objectives: </strong>There is limited knowledge on how providers and patients in the emergency department (ED) use electronic health records (EHRs) to facilitate the diagnostic process. While EHRs can support diagnostic decision-making, EHR features that are not user-centered may increase the likelihood of diagnostic error. We aimed to identify how EHRs facilitate or impede the diagnostic process in the ED and to identify opportunities to reduce diagnostic errors and improve care quality.</p><p><strong>Materials and methods: </strong>We conducted semistructured interviews with 10 physicians, 15 nurses, and 8 patients across 4 EDs. Data were analyzed using a hybrid thematic analysis approach, which blends deductive (ie, using multiple conceptual frameworks) and inductive coding strategies. A team of 4 coders performed coding.</p><p><strong>Results: </strong>We identified 4 themes, 3 at the care team level and 1 at the patient level. At the care team level, the benefits of the EHR in the diagnostic process included (1) customizing features to facilitate diagnostic workup and (2) aiding in communication. However, (3) EHR-driven protocols were found to potentially burden the care process and reliance on asynchronous communication could impede team dynamics. 
At the patient level, we found that (4) patient portals facilitated meaningful patient engagement through timely delivery of results.</p><p><strong>Discussion: </strong>While EHRs can improve the diagnostic process, they can also impair communication and increase workload. Electronic health record design should leverage provider-created tools to improve usability and enhance diagnostic safety.</p><p><strong>Conclusions: </strong>Our findings have important implications for health information technology design and policy. Further work should assess optimal ways to release patient results via the EHR portal.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf029"},"PeriodicalIF":2.5,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12015938/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144000007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-04-15 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf028
{"title":"Correction to: Leveraging deep learning to detect stance in Spanish tweets on COVID-19 vaccination.","authors":"","doi":"10.1093/jamiaopen/ooaf028","DOIUrl":"https://doi.org/10.1093/jamiaopen/ooaf028","url":null,"abstract":"<p><p>[This corrects the article DOI: 10.1093/jamiaopen/ooaf007.].</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf028"},"PeriodicalIF":2.5,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11999061/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144052043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-04-10 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf026
Ruichen Rong, Zifan Gu, Hongyin Lai, Tanna L Nelson, Tony Keller, Clark Walker, Kevin W Jin, Catherine Chen, Ann Marie Navar, Ferdinand Velasco, Eric D Peterson, Guanghua Xiao, Donghan M Yang, Yang Xie
{"title":"A deep learning model for clinical outcome prediction using longitudinal inpatient electronic health records.","authors":"Ruichen Rong, Zifan Gu, Hongyin Lai, Tanna L Nelson, Tony Keller, Clark Walker, Kevin W Jin, Catherine Chen, Ann Marie Navar, Ferdinand Velasco, Eric D Peterson, Guanghua Xiao, Donghan M Yang, Yang Xie","doi":"10.1093/jamiaopen/ooaf026","DOIUrl":"https://doi.org/10.1093/jamiaopen/ooaf026","url":null,"abstract":"<p><strong>Objectives: </strong>Recent advances in deep learning show significant potential in analyzing continuous monitoring electronic health records (EHR) data for clinical outcome prediction. We aim to develop a Transformer-based, Encounter-level Clinical Outcome (TECO) model to predict mortality in the intensive care unit (ICU) using inpatient EHR data.</p><p><strong>Materials and methods: </strong>The TECO model was developed using multiple baseline and time-dependent clinical variables from 2579 hospitalized COVID-19 patients to predict ICU mortality and was validated externally in an acute respiratory distress syndrome cohort (<i>n</i> = 2799) and a sepsis cohort (<i>n</i> = 6622) from the Medical Information Mart for Intensive Care IV (MIMIC-IV). Model performance was evaluated based on the area under the receiver operating characteristic curve (AUC) and compared with Epic Deterioration Index (EDI), random forest (RF), and extreme gradient boosting (XGBoost).</p><p><strong>Results: </strong>In the COVID-19 development dataset, TECO achieved higher AUC (0.89-0.97) across various time intervals compared to EDI (0.86-0.95), RF (0.87-0.96), and XGBoost (0.88-0.96). In the 2 MIMIC testing datasets (EDI not available), TECO yielded higher AUC (0.65-0.77) than RF (0.59-0.75) and XGBoost (0.59-0.74). 
In addition, TECO was able to identify clinically interpretable features that were correlated with the outcome.</p><p><strong>Discussion: </strong>The TECO model outperformed proprietary metrics and conventional machine learning models in predicting ICU mortality among patients with COVID-19, widespread inflammation, respiratory illness, and other organ failures.</p><p><strong>Conclusion: </strong>The TECO model demonstrates a strong capability for predicting ICU mortality using continuous monitoring data. While further validation is needed, TECO has the potential to serve as a powerful early warning tool across various diseases in inpatient settings.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf026"},"PeriodicalIF":2.5,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11984207/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-04-09 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf021
Liz Salmi, Dana M Lewis, Jennifer L Clarke, Zhiyong Dong, Rudy Fischmann, Emily I McIntosh, Chethan R Sarabu, Catherine M DesRoches
{"title":"A proof-of-concept study for patient use of open notes with large language models.","authors":"Liz Salmi, Dana M Lewis, Jennifer L Clarke, Zhiyong Dong, Rudy Fischmann, Emily I McIntosh, Chethan R Sarabu, Catherine M DesRoches","doi":"10.1093/jamiaopen/ooaf021","DOIUrl":"https://doi.org/10.1093/jamiaopen/ooaf021","url":null,"abstract":"<p><strong>Objectives: </strong>The use of large language models (LLMs) is growing for both clinicians and patients. While researchers and clinicians have explored LLMs to manage patient portal messages and reduce burnout, there is less documentation about how patients use these tools to understand clinical notes and inform decision-making. This proof-of-concept study examined the reliability and accuracy of LLMs in responding to patient queries based on an open visit note.</p><p><strong>Materials and methods: </strong>In a cross-sectional proof-of-concept study, 3 commercially available LLMs (ChatGPT 4o, Claude 3 Opus, Gemini 1.5) were evaluated using 4 distinct prompt series-<i>Standard</i>, <i>Randomized</i>, <i>Persona</i>, and <i>Randomized Persona</i>-with multiple questions, designed by patients, in response to a single neuro-oncology progress note. LLM responses were scored by the note author (neuro-oncologist) and a patient who receives care from the note author, using an 8-criterion rubric that assessed <i>Accuracy</i>, <i>Relevance</i>, <i>Clarity</i>, <i>Actionability</i>, <i>Empathy/Tone</i>, <i>Completeness</i>, <i>Evidence</i>, and <i>Consistency</i>. Descriptive statistics were used to summarize the performance of each LLM across all prompts.</p><p><strong>Results: </strong>Overall, the Standard and Persona-based prompt series yielded the best results across all criteria regardless of LLM. ChatGPT 4o using Persona-based prompts scored highest in all categories. 
All LLMs scored low in the use of <i>Evidence</i>.</p><p><strong>Discussion: </strong>This proof-of-concept study highlighted the potential for LLMs to assist patients in interpreting open notes. The most effective LLM responses were achieved by applying <i>Persona</i>-style prompts to a patient's question.</p><p><strong>Conclusion: </strong>Optimizing LLMs for patient-driven queries, and patient education and counseling around the use of LLMs, have potential to enhance patient use and understanding of their health information.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf021"},"PeriodicalIF":2.5,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11980777/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144031741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-04-03 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf023
David McMinn, Tom Grant, Laura DeFord-Watts, Veronica Porkess, Margarita Lens, Christopher Rapier, Wilson Q Joe, Timothy A Becker, Walter Bender
{"title":"Using artificial intelligence to expedite and enhance plain language summary abstract writing of scientific content.","authors":"David McMinn, Tom Grant, Laura DeFord-Watts, Veronica Porkess, Margarita Lens, Christopher Rapier, Wilson Q Joe, Timothy A Becker, Walter Bender","doi":"10.1093/jamiaopen/ooaf023","DOIUrl":"10.1093/jamiaopen/ooaf023","url":null,"abstract":"<p><strong>Objective: </strong>To assess the capacity of a bespoke artificial intelligence (AI) process to help medical writers efficiently generate quality plain language summary abstracts (PLSAs).</p><p><strong>Materials and methods: </strong>Three independent studies were conducted. In Studies 1 and 3, original scientific abstracts (OSAs; <i>n</i> = 48, <i>n</i> = 2) and corresponding PLSAs written by medical writers versus bespoke AI were assessed using standard readability metrics. Study 2 compared time and effort of medical writers (<i>n</i> = 10) drafting PLSAs starting with an OSA (<i>n</i> = 6) versus the output of 1 bespoke AI (<i>n</i> = 6) and 1 non-bespoke AI (<i>n</i> = 6) process. These PLSAs (<i>n</i> = 72) were assessed by subject matter experts (SMEs; <i>n</i> = 3) for accuracy and physicians (<i>n</i> = 7) for patient suitability. Lastly, in Study 3, medical writers (<i>n</i> = 22) and patients/patient advocates (<i>n</i> = 5) compared quality of medical writer and bespoke AI-generated PLSAs.</p><p><strong>Results: </strong>In Study 1, bespoke AI PLSAs were easier to read than medical writer PLSAs across all readability metrics (<i>P</i> <.01). In Study 2, bespoke AI output saved medical writers >40% in time for PLSA creation and required less effort than unassisted writing. SME-assessed quality was higher for AI-assisted PLSAs, and physicians preferred bespoke AI-generated outputs for patient use. 
In Study 3, bespoke AI PLSAs were more readable and rated of higher quality than medical writer PLSAs.</p><p><strong>Discussion: </strong>The bespoke AI process may enhance access to health information by helping medical writers produce PLSAs of scientific content that are fit for purpose.</p><p><strong>Conclusion: </strong>The bespoke AI process can more efficiently create better quality, more readable first draft PLSAs versus medical writer-generated PLSAs.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf023"},"PeriodicalIF":2.5,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11967854/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143781285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computer-assisted prescription of erythropoiesis-stimulating agents in patients undergoing maintenance hemodialysis: a randomized control trial for artificial intelligence model selection.","authors":"Lee-Moay Lim, Ming-Yen Lin, Chan Hsu, Chantung Ku, Yi-Pei Chen, Yihuang Kang, Yi-Wen Chiu","doi":"10.1093/jamiaopen/ooaf020","DOIUrl":"10.1093/jamiaopen/ooaf020","url":null,"abstract":"<p><strong>Objective: </strong>Machine learning (ML) algorithms are promising tools for managing anemia in hemodialysis (HD) patients. However, their efficacy in predicting erythropoiesis-stimulating agents (ESAs) doses remains uncertain. This study aimed to evaluate the effectiveness of a contemporary artificial intelligence (AI) model in prescribing ESA doses compared to physicians for HD patients.</p><p><strong>Materials and methods: </strong>This double-blinded control trial randomized participants into traditional doctor (Dr) and AI groups. In the Dr group, doses of ESA were determined by following clinical guideline recommendations, while in the AI group, they were predicted by the developed models named Random effects (REEM) trees, Mixed-effect random forest (MERF), Long short-term memory (LSTM) networks-I, and LSTM-II. The primary outcome was the capability to maintain patients' hemoglobin (Hb) value near 11 g/dL with a margin of 0.25 g/dL after administering the suggested ESA dose, with the secondary outcome being Hb value between 10 and 12 g/dL.</p><p><strong>Results: </strong>A total of 124 participants were enrolled, with 104 completing the study. The mean Hb values were 10.8 and 10.9 g/dL in the AI and Dr groups, respectively, with 69.7% and 73.5% of participants in the respective groups maintaining Hb levels between 10 and 12 g/dL. Only the REEM trees model passed the non-inferiority test for the primary outcome with a margin of 0.25 g/dL and the secondary outcome with a margin of 15%. 
There was no difference in severe adverse events between the 2 groups.</p><p><strong>Conclusion: </strong>The REEM trees AI model demonstrated non-inferiority to physicians in prescribing ESA doses for HD patients, maintaining Hb levels within the therapeutic target.</p><p><strong>ClinicalTrials.gov identifier: </strong>NCT04185519.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf020"},"PeriodicalIF":2.5,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11950923/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143755002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-03-26 | eCollection Date: 2025-04-01 | DOI: 10.1093/jamiaopen/ooaf018
Jin-Ah Sim, Xiaolei Huang, Rachel T Webster, Kumar Srivastava, Kirsten K Ness, Melissa M Hudson, Justin N Baker, I-Chan Huang
{"title":"Leveraging natural language processing and machine learning to characterize psychological stress and life meaning and purpose in pediatric cancer survivors: a preliminary validation study.","authors":"Jin-Ah Sim, Xiaolei Huang, Rachel T Webster, Kumar Srivastava, Kirsten K Ness, Melissa M Hudson, Justin N Baker, I-Chan Huang","doi":"10.1093/jamiaopen/ooaf018","DOIUrl":"10.1093/jamiaopen/ooaf018","url":null,"abstract":"<p><strong>Objective: </strong>To determine if natural language processing (NLP) and machine learning (ML) techniques accurately identify interview-based psychological stress and meaning/purpose data in child/adolescent cancer survivors.</p><p><strong>Materials and methods: </strong>Interviews were conducted with 51 survivors (aged 8-17.9 years; ≥5-years post-therapy) from St Jude Children's Research Hospital. Two content experts coded 244 and 513 semantic units, focusing on attributes of psychological stress (anger, controllability/manageability, fear/anxiety) and attributes of meaning/purpose (goal, optimism, purpose). Content experts extracted specific attributes from the interviews, which were designated as the gold standard. Two NLP/ML methods, Word2Vec with Extreme Gradient Boosting (XGBoost), and Bidirectional Encoder Representations from Transformers Large (BERT<sub>Large</sub>), were validated using accuracy, areas under the receiver operating characteristic curves (AUROCC), and under the precision-recall curves (AUPRC).</p><p><strong>Results: </strong>BERT<sub>Large</sub> demonstrated higher accuracy, AUROCC, and AUPRC in identifying all attributes of psychological stress and meaning/purpose versus Word2Vec/XGBoost. 
BERT<sub>Large</sub> significantly outperformed Word2Vec/XGBoost in characterizing all attributes (<i>P</i> <.05) except for the purpose attribute of meaning/purpose.</p><p><strong>Discussion: </strong>These findings suggest that AI tools can help healthcare providers efficiently assess emotional well-being of childhood cancer survivors, supporting future clinical interventions.</p><p><strong>Conclusions: </strong>NLP/ML effectively identifies interview-based data for child/adolescent cancer survivors.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 2","pages":"ooaf018"},"PeriodicalIF":2.5,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11936487/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143721728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JAMIA Open | Pub Date: 2025-02-28 | eCollection Date: 2025-02-01 | DOI: 10.1093/jamiaopen/ooaf006
Nicholas C Wan, Monika E Grabowska, Vern Eric Kerchberger, Wei-Qi Wei
{"title":"Exploring beyond diagnoses in electronic health records to improve discovery: a review of the phenome-wide association study.","authors":"Nicholas C Wan, Monika E Grabowska, Vern Eric Kerchberger, Wei-Qi Wei","doi":"10.1093/jamiaopen/ooaf006","DOIUrl":"10.1093/jamiaopen/ooaf006","url":null,"abstract":"<p><strong>Objective: </strong>The phenome-wide association study (PheWAS) systematically examines the phenotypic spectrum extracted from electronic health records (EHRs) to uncover correlations between phenotypes and exposures. This review explores methodologies, highlights challenges, and outlines future directions for EHR-driven PheWAS.</p><p><strong>Materials and methods: </strong>We searched the PubMed database for articles spanning from 2010 to 2023, and we collected data regarding exposures, phenotypes, cohorts, terminologies, replication, and ancestry.</p><p><strong>Results: </strong>Our search yielded 690 articles. Following exclusion criteria, we identified 291 articles published between January 1, 2010, and December 31, 2023. A total of 162 (55.6%) articles defined phenomes using phecodes, indicating that research is reliant on the organization of billing codes. Moreover, 72.8% of articles utilized exposures consisting of genetic data, and the majority (69.4%) of PheWAS lacked replication analyses.</p><p><strong>Discussion: </strong>Existing literature underscores the need for deeper phenotyping, the variability in PheWAS exposure variables, and the absence of replication in PheWAS. 
Current applications of PheWAS mainly focus on cardiovascular, metabolic, and endocrine phenotypes; thus, applications of PheWAS in uncommon diseases, which may lack structured data, remain largely understudied.</p><p><strong>Conclusions: </strong>With modern EHRs, future PheWAS should extend beyond diagnosis codes and consider additional data like clinical notes or medications to create comprehensive phenotype profiles that consider severity, temporality, risk, and ancestry. Furthermore, data interoperability initiatives may help mitigate the paucity of PheWAS replication analyses. With the growing availability of data in EHR, PheWAS will remain a powerful tool in precision medicine.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 1","pages":"ooaf006"},"PeriodicalIF":2.5,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11879097/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}