{"title":"Comparative Analysis of Fusion Strategies for Imaging and Non-imaging Data - Use-case of Hospital Discharge Prediction.","authors":"Vedant Parikh, Amara Tariq, Bhavik Patel, Imon Banerjee","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Accurate prediction of future clinical events such as discharge from hospital can not only improve hospital resource management but also provide an indicator of a patient's clinical condition. Within the scope of this work, we perform a comparative analysis of deep learning based fusion strategies against traditional single source models for prediction of discharge from hospital by fusing information encoded in two diverse but relevant data modalities, i.e., chest X-ray images and tabular electronic health records (EHR). We evaluate multiple fusion strategies including late, early and joint fusion in terms of their efficacy for target prediction compared to EHR-only and Image-only predictive models. Results indicated the importance of merging information from two modalities for prediction, as fusion models tended to outperform single modality models, and indicated that the joint fusion scheme was the most effective for target prediction. The joint fusion model merges the two modalities through a branched neural network that is jointly trained in an end-to-end fashion to extract target-relevant information from both modalities.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"652-661"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141810/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141199535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development and Validation of an Individual Socioeconomic Deprivation Index (ISDI) in the NIH's <i>All of Us</i> Data Network.","authors":"Nripendra Acharya, Karthik Natarajan","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Many of the existing composite social determinant of health indices, such as Area Deprivation Index, are constrained by their reliance on geographic approximations and American Community Survey data. This study builds on the body of literature around deprivation indices to construct an individual socioeconomic deprivation index (ISDI) within the NIH's All of Us Data Network by using weighted multiple correspondence analysis on SDOH data elements collected at the participant level. In this study, the correlation between ISDI and another area-approximated index is assessed to the extent possible, along with the changes in an AI model's performance due to stratified sampling based on ISDI quintiles. Individual level deprivation indices may have a wide range of utility, particularly in the context of precision medicine in both centralized and distributed data networks.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"36-45"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141807/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141200415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Driving Precision of Pediatric VTE Risk-stratification through Genetics.","authors":"Samaya S Badrieh, Lisa Bastarache, Xinnan Niu, Jing He, Jamie R Robinson","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This study addresses rising incidence of pediatric venous thromboembolism by validating a VTE phenotype and developing a polygenic risk score (PRS) using UK Biobank data. Our findings demonstrate predictive value of the PRS, enhancing VTE risk assessment in clinical settings. Future steps involve integrating the PRS into risk stratification models.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"498"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141856/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141200627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pre-test Prediction of Non-ischemic Cardiomyopathies using Time-Series EHR Data.","authors":"Kary Ishwaran, Bryan Q Abadie, Po-Hao Chen, Michael Bolen, Tara Karamlou, Richard Grimm, W H Wilson Tang, Christopher Nguyen, Deborah Kwon, David Chen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Clinical imaging is an important diagnostic test to diagnose non-ischemic cardiomyopathies (NICM). However, accurate interpretation of imaging studies often requires readers to review patient histories, a time-consuming and tedious task. We propose to use time-series analysis to predict the most likely NICMs using longitudinal electronic health records (EHR) as a pseudo-summary of EHR records. Time-series formatted EHR data can provide temporality information important towards accurate prediction of disease. Specifically, we leverage ICD-10 codes and various recurrent neural network architectures for predictive modeling. We trained our models on a large cohort of NICM patients who underwent cardiac magnetic resonance imaging (CMR) and a smaller cohort undergoing echocardiogram. The performance of the proposed technique achieved good micro-area under the curve (0.8357), F1 score (0.5708) and precision at 3 (0.8078) across all models for cardiac magnetic resonance imaging (CMR) but only moderate performance for transthoracic echocardiogram (TTE) of 0.6938, 0.4399 and 0.5864 respectively. We show that our model has the potential to provide accurate pre-test differential diagnosis, thereby potentially reducing clerical burden on physicians.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"239-248"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141858/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Clinical Predictive Modeling through Model Complexity-Driven Class Proportion Tuning for Class Imbalanced Data: An Empirical Study on Opioid Overdose Prediction.","authors":"Yinan Liu, Xinyu Dong, Weimin Lyu, Richard N Rosenthal, Rachel Wong, Tengfei Ma, Jun Kong, Fusheng Wang","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Class imbalance issues are prevalent in the medical field and significantly impact the performance of clinical predictive models. Traditional techniques to address this challenge aim to rebalance class proportions. They generally assume that the rebalanced proportions are derived from the original data, without considering the intricacies of the model utilized. This study challenges the prevailing assumption and introduces a new method that ties the optimal class proportions to model complexity. This approach allows for individualized tuning of class proportions for each model. Our experiments, centered on the opioid overdose prediction problem, highlight the performance gains achieved by this approach. Furthermore, rigorous regression analysis affirms the merits of the proposed theoretical framework, demonstrating a statistically significant correlation between hyperparameters controlling model complexity and the optimal class proportions.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"334-343"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141828/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring and Reducing Racial Bias in a Pediatric Urinary Tract Infection Model.","authors":"Joshua W Anderson, Nader Shaikh, Shyam Visweswaran","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Clinical predictive models that include race as a predictor have the potential to exacerbate disparities in healthcare. Such models can be respecified to exclude race or optimized to reduce racial bias. We investigated the impact of such respecifications in a predictive model - UTICalc - which was designed to reduce catheterizations in young children with suspected urinary tract infections. To reduce racial bias, race was removed from the UTICalc logistic regression model and replaced with two new features. We compared the two versions of UTICalc using fairness and predictive performance metrics to understand the effects on racial bias. In addition, we derived three new models for UTICalc to specifically improve racial fairness. Our results show that, as predicted by previously described impossibility results, fairness cannot be simultaneously improved on all fairness metrics, and model respecification may improve racial fairness but decrease overall predictive performance.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"488-497"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141814/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Best of Both Worlds: Bridging One Model for All and Group-Specific Model Approaches using Ensemble-based Subpopulation Modeling.","authors":"Purity Mugambi, Stephanie Carreiro","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Subpopulation models have become of increasing interest in prediction of clinical outcomes because they promise to perform better for underrepresented patient subgroups. However, the personalization benefits gained from these models trade off against their statistical power, and can be impractical when the subpopulation's sample size is small. We hypothesize that a hierarchical model in which population information is integrated into subpopulation models would preserve the personalization benefits and offset the loss of power. In this work, we integrate ideas from ensemble modeling, personalization, and hierarchical modeling and build ensemble-based subpopulation models in which specialization relies on whole group samples. This approach significantly improves the precision of the positive class, especially for the underrepresented subgroups, with minimal cost to the recall. It consistently outperforms one model for all and one model for each subgroup approaches, especially in the presence of a high class-imbalance, for subgroups with at least 380 training samples.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"354-363"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141864/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clarifying Chronic Obstructive Pulmonary Disease Genetic Associations Observed in Biobanks via Mediation Analysis of Smoking.","authors":"Katrina Bazemore, Jaehyun Joo, Wei-Ting Hwang, Blanca E Himes","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Varying case definitions of COPD have heterogeneous genetic risk profiles, potentially reflective of disease subtypes or classification bias (e.g., smokers more likely to be diagnosed with COPD). To better understand differences in genetic loci associated with ICD-defined versus spirometry-defined COPD, we contrasted their GWAS results with those for heavy smoking among 337,138 UK Biobank participants. Overlapping risk loci were found in/near the genes ZEB2, FAM136B, CHRNA3, and CHRNA4, with the CHRNA3 locus shared across all three traits. Mediation analysis to estimate the effects of lead genotyped variants mediated by smoking found significant indirect effects for the FAM136B, CHRNA3, and CHRNA4 loci for both COPD definitions. Adjustment for mediator-outcome confounders modestly attenuated indirect effects, though in the CHRNA4 locus for spirometry-defined COPD the proportion mediated increased an additional 8.47%. Our results suggest that differences between ICD-defined and spirometry-defined COPD-associated genetic loci are not a result of smoking biasing classification.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"499-508"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141825/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141198537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PFERM: A Fair Empirical Risk Minimization Approach with Prior Knowledge.","authors":"Bojian Hou, Andrés Mondragón, Davoud Ataee Tarzanagh, Zhuoping Zhou, Andrew J Saykin, Jason H Moore, Marylyn D Ritchie, Qi Long, Li Shen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Fairness is crucial in machine learning to prevent bias based on sensitive attributes in classifier predictions. However, the pursuit of strict fairness often sacrifices accuracy, particularly when significant prevalence disparities exist among groups, making classifiers less practical. For example, Alzheimer's disease (AD) is more prevalent in women than men, making equal treatment inequitable for females. Accounting for prevalence ratios among groups is essential for fair decision-making. In this paper, we introduce prior knowledge for fairness, which incorporates prevalence ratio information into the fairness constraint within the Empirical Risk Minimization (ERM) framework. We develop the Prior-knowledge-guided Fair ERM (PFERM) framework, aiming to minimize expected risk within a specified function class while adhering to a prior-knowledge-guided fairness constraint. This approach strikes a flexible balance between accuracy and fairness. Empirical results confirm its effectiveness in preserving fairness without compromising accuracy.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"211-220"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141835/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated HIV Case Identification from the MIMIC-IV Database.","authors":"Kai Jiang, Tru Cao","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Automatic HIV phenotyping is needed for HIV research based on electronic health records (EHRs). MIMIC-IV, an extension of MIMIC-III, contains more than 520,000 hospital admissions and has become a valuable EHR database for secondary medical research. However, there was no prior phenotyping algorithm to extract HIV cases from MIMIC-IV, which requires a comprehensive knowledge of the database. Moreover, previous HIV phenotyping algorithms did not consider the new HIV-1/HIV-2 antibody differentiation immunoassay tests that MIMIC-IV contains. Our work provided insight into the structure and data elements in MIMIC-IV and proposed a new HIV phenotyping algorithm to fill in these gaps. The results included MIMIC-IV's data tables and elements used, 1,781 and 1,843 HIV cases from MIMIC-IV's versions 0.4 and 2.1, respectively, and summary statistics of these two HIV case cohorts. They could be used for the development of statistical and machine learning models in future studies about the disease.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2024 ","pages":"555-564"},"PeriodicalIF":0.0,"publicationDate":"2024-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141847/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}