Hang Xu, Pierangelo Renella, Ramin Badiyan, Ziad R Hindosh, Francisco X Elisarraras, Bing Zhu, Gary M Satou, Majid Husain, J Paul Finn, William Hsu, Kim-Lien Nguyen
JAMIA Open, volume 8, issue 3, ooaf035. Published 2025-05-15. DOI: 10.1093/jamiaopen/ooaf035 (https://doi.org/10.1093/jamiaopen/ooaf035). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12080993/pdf/
A phenotyping algorithm for classification of single ventricle physiology using electronic health records.
Objectives: Congenital heart disease (CHD) patients with single ventricle physiology (SVP) have heterogeneous characteristics that challenge cohort classification. We aim to develop a phenotyping algorithm that accurately identifies SVP patients using electronic health record (EHR) data.
Materials and methods: We used ICD-9 and ICD-10 codes for initial classification, then enhanced the algorithm with domain expertise, imaging reports, and progress notes. The algorithm was developed using a cohort of 1020 patients who underwent magnetic resonance imaging scans and tested in a separate cohort of 2500 CHD patients with adjudication. Validation was performed in a holdout group of 22 500 CHD patients. We evaluated performance using accuracy, sensitivity, precision, and F1 score, and compared it to a published algorithm for SVP using the same dataset.
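The two-stage design described in the methods (ICD codes for an initial label, then refinement with free-text reports) could be sketched roughly as below. This is an illustrative sketch only, not the authors' algorithm: the ICD code set, the keyword patterns, and the `Patient` and `classify_svp` names are all hypothetical, and a real implementation would need clinician-curated features and proper negation handling.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set

# Illustrative code set only; NOT the list used in the study.
SVP_ICD_CODES = {
    "Q20.4",  # ICD-10: double inlet ventricle
    "Q23.4",  # ICD-10: hypoplastic left heart syndrome
    "746.7",  # ICD-9: hypoplastic left heart syndrome
    "745.3",  # ICD-9: common ventricle
}

# Hypothetical phrases that support or refute SVP in report text.
SUPPORT = re.compile(r"single ventricle|fontan|glenn|hypoplastic left heart", re.I)
REFUTE = re.compile(
    r"no evidence of single ventricle|biventricular (repair|circulation)", re.I
)


@dataclass
class Patient:
    """Hypothetical minimal EHR record: coded diagnoses plus free-text notes."""
    icd_codes: Set[str] = field(default_factory=set)
    notes: List[str] = field(default_factory=list)


def classify_svp(patient: Patient) -> bool:
    """Return True if the patient is labeled SVP under this toy rule set."""
    # Stage 1: initial label from ICD-9/ICD-10 codes.
    icd_hit = bool(patient.icd_codes & SVP_ICD_CODES)

    # Stage 2: cross-reference imaging reports and progress notes.
    text = " ".join(patient.notes)
    refuted = bool(REFUTE.search(text))
    supported = bool(SUPPORT.search(text)) and not refuted

    if icd_hit:
        # Drop likely false positives whose notes contradict the code.
        return not refuted
    # Rescue cases with absent or erroneous codes but supportive notes.
    return supported
```

For example, `classify_svp(Patient({"Q20.4"}, ["Echo shows biventricular circulation."]))` returns `False` under these toy rules, while a patient with no SVP code but a note mentioning Fontan palliation is labeled `True`.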
Results: In the 2500-patient testing cohort, our algorithm, based on specialty-defined features and International Classification of Diseases (ICD) codes, achieved 99.24% accuracy, 94.12% precision, 85.11% sensitivity, and 89.39% F1 score. In contrast, the published method achieved 95.20% accuracy, 43.23% precision, 88.30% sensitivity, and 58.04% F1 score. In the 22 500-patient validation cohort, our algorithm achieved 93.82% precision, while the published method achieved 43.00%.
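To make the four reported metrics concrete, the sketch below computes them from a confusion matrix. The abstract does not report raw counts; the counts used here are an assumption, back-calculated so that they sum to 2500 and are consistent with the rounded percentages above. The function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, sensitivity, and F1 as percentages."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)    # share of predicted SVP that is true SVP
    sensitivity = tp / (tp + fn)  # share of true SVP that is detected (recall)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": round(100 * accuracy, 2),
        "precision": round(100 * precision, 2),
        "sensitivity": round(100 * sensitivity, 2),
        "f1": round(100 * f1, 2),
    }


# ASSUMED counts (not given in the abstract), chosen to match the
# reported percentages for a 2500-patient cohort with 94 true SVP cases.
proposed = classification_metrics(tp=80, fp=5, fn=14, tn=2401)
published = classification_metrics(tp=83, fp=109, fn=11, tn=2297)

print(proposed)
print(published)
```

Under these assumed counts, the proposed algorithm trades a few missed cases (14 versus 11 false negatives) for far fewer false positives (5 versus 109), which is why its precision and F1 score are much higher while its sensitivity is slightly lower.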
Discussion and conclusions: Our automated phenotyping algorithm, combined with physician adjudication, outperforms a published method for SVP classification. It identifies false positives by cross-referencing clinical notes and detects SVP cases that were missed because of absent or erroneous ICD codes. Our integrated phenotyping algorithm showed excellent performance and has the potential to improve research and clinical care of SVP patients through the automated development of an electronic cohort for prognostication, monitoring, and management.