Xiayuan Huang, Jatin Arora, Abdullah Mesut Erzurumluoglu, Stephen A Stanhope, Daniel Lam, Hongyu Zhao, Zhihao Ding, Zuoheng Wang, Johann de Jong
{"title":"Enhancing patient representation learning with inferred family pedigrees improves disease risk prediction.","authors":"Xiayuan Huang, Jatin Arora, Abdullah Mesut Erzurumluoglu, Stephen A Stanhope, Daniel Lam, Hongyu Zhao, Zhihao Ding, Zuoheng Wang, Johann de Jong","doi":"10.1093/jamia/ocae297","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.</p><p><strong>Methods: </strong>To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.</p><p><strong>Results: </strong>Taking disease risk prediction as a use case, we demonstrate that explicitly modeling family relations significantly improves predictions across the disease spectrum. We then show how ALIGATEHR's attention mechanism, which links patients' disease risk to their relatives' clinical profiles, successfully captures genetic aspects of diseases using longitudinal EHR diagnosis data. Finally, we use ALIGATEHR to successfully distinguish the 2 main inflammatory bowel disease subtypes with highly shared risk factors and symptoms (Crohn's disease and ulcerative colitis).</p><p><strong>Conclusion: </strong>Overall, our results highlight that family relations should not be overlooked in EHR research and illustrate ALIGATEHR's great potential for enhancing patient representation learning for predictive and interpretable modeling of EHRs.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7000,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Medical Informatics Association","FirstCategoryId":"91","ListUrlMain":"https://doi.org/10.1093/jamia/ocae297","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Background: Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.
Methods: To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.
Results: Taking disease risk prediction as a use case, we demonstrate that explicitly modeling family relations significantly improves predictions across the disease spectrum. We then show how ALIGATEHR's attention mechanism, which links patients' disease risk to their relatives' clinical profiles, successfully captures genetic aspects of diseases using longitudinal EHR diagnosis data. Finally, we use ALIGATEHR to successfully distinguish the 2 main inflammatory bowel disease subtypes with highly shared risk factors and symptoms (Crohn's disease and ulcerative colitis).
Conclusion: Overall, our results highlight that family relations should not be overlooked in EHR research and illustrate ALIGATEHR's great potential for enhancing patient representation learning for predictive and interpretable modeling of EHRs.
期刊介绍:
JAMIA is AMIA''s premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA''s articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.