Colin M Rogerson, Christopher W Bartlett, John Price, Lang Li, Eneida A Mendonca, Shaun Grannis
Journal of the American Medical Informatics Association. Published 2025-10-15. DOI: 10.1093/jamia/ocaf177 (https://doi.org/10.1093/jamia/ocaf177)
Derivation and validation of an algorithm for maternal-child linkage in electronic health records.
Introduction: We created a probabilistic maternal-child electronic health record (EHR) linkage algorithm to promote clinical research in maternal-child health.
Methods: We used EHR data from 1994 to 2024 to create an XGBoost model to predict maternal-child linkages. The predictor variables were standard EHR elements: first name, last name, birthdate, address, phone number, and email. An EHR-embedded maternal-child indicator served as the deterministic outcome.
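The abstract does not describe the blocking criteria or how record pairs were turned into model inputs, so the following is only a minimal sketch of the general idea, under stated assumptions. The toy records, the field names, the choice of phone number as a blocking key, the string-similarity measure, and the age-gap feature are all hypothetical and are not taken from the study. The sketch groups records that share a blocking key, forms candidate pairs within each group, and computes pairwise comparison features of the kind a classifier such as XGBoost could be trained on.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; field names are illustrative, not the study's schema.
RECORDS = [
    {"id": 1, "first": "Maria", "last": "Lopez", "birth_year": 1990,
     "phone": "555-0101", "address": "12 Oak St"},
    {"id": 2, "first": "Ana", "last": "Lopez", "birth_year": 2020,
     "phone": "555-0101", "address": "12 Oak St"},
    {"id": 3, "first": "John", "last": "Reed", "birth_year": 1985,
     "phone": "555-0199", "address": "9 Elm Ave"},
    {"id": 4, "first": "Sam", "last": "Reid", "birth_year": 2019,
     "phone": "555-0199", "address": "9 Elm Avenue"},
]


def block_by(records, key):
    """Group records that share the same value for the blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[record[key]].append(record)
    return blocks


def candidate_pairs(records, key="phone"):
    """Form (older, younger) pairs within each block only.

    Blocking on phone number is an assumption made for this sketch.
    """
    pairs = []
    for members in block_by(records, key).values():
        for a, b in combinations(members, 2):
            older, younger = sorted((a, b), key=lambda r: r["birth_year"])
            pairs.append((older, younger))
    return pairs


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pair_features(older, younger):
    """Comparison features for one candidate pair (illustrative choices)."""
    return {
        "last_name_sim": similarity(older["last"], younger["last"]),
        "address_sim": similarity(older["address"], younger["address"]),
        "phone_match": float(older["phone"] == younger["phone"]),
        "age_gap_years": younger["birth_year"] - older["birth_year"],
    }


pairs = candidate_pairs(RECORDS)
features = [pair_features(older, younger) for older, younger in pairs]
```

Blocking keeps the number of compared pairs far below the full cross product of records, which is why only a subset of all possible pairs is ever scored.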
Results: From 82 million unique records, 6.2 billion potential pairs met blocking criteria. Of the potential pairs, 33 364 674 contained the deterministic indicator and were used as cases, and an equal number of controls were randomly sampled. The final model obtained an accuracy of 92%, a precision of 98%, a recall of 87%, and an F1-score of 92%.
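As a quick consistency check on the reported figures, the F1-score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. The sketch below also derives the size of the balanced training set and the approximate share of potential pairs that carried the deterministic indicator. The variable names are illustrative, and because "6.2 billion" is a rounded figure, the resulting fraction is approximate.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
precision, recall = 0.98, 0.87
f1 = f1_score(precision, recall)

cases = 33_364_674        # pairs with the deterministic indicator
potential_pairs = 6.2e9   # rounded figure from the abstract

# Equal numbers of cases and controls give a balanced data set.
training_pairs = 2 * cases

# Approximate share of potential pairs that were labeled cases.
case_fraction = cases / potential_pairs

print(f"F1 = {f1:.4f}")
print(f"training pairs = {training_pairs:,}")
print(f"case fraction = {case_fraction:.2%}")
```

The recomputed F1 rounds to 92%, matching the reported value, and the labeled cases make up roughly half a percent of the pairs that met the blocking criteria.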
Conclusion: We derived and validated a probabilistic maternal-child linkage algorithm using routinely collected EHR data elements that could benefit future observational research in maternal-child health.
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.