Application of an Externally Developed Algorithm to Identify Research Cases and Controls from EHR Data: Trials and Triumphs.

IF 2.1 | Region 2, Medicine | JCR Q4, MEDICAL INFORMATICS

Applied Clinical Informatics Pub Date : 2025-03-01 Epub Date: 2025-01-24 DOI:10.1055/a-2524-5216

Nelly Estefanie Garduno-Rapp, Simone Herzberg, Henry H Ong, Cindy Kao, Christoph U Lehmann, Srushti Gangireddy, Nitin B Jain, Ayush Giri

Pages: 314-326. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11945218/pdf/

Citations: 0

Abstract

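Background: The use of electronic health records (EHRs) in research requires robust, interoperable systems. By linking biorepositories to EHR algorithms, researchers can efficiently identify cases and controls for large observational studies such as genome-wide association studies (GWAS). This is critical for keeping research efficient and cost-effective. However, the lack of standardized metadata and algorithms across different EHRs complicates their sharing and application. Our study presents an example of a successful implementation and validation process.

Objective: To implement and validate a rule-based algorithm from a tertiary medical center in Tennessee, use it to classify cases and controls in a rotator cuff tear study at a tertiary medical center in North Texas, and assess the algorithm's performance.

Methods: We applied a phenotypic algorithm (designed and validated at a tertiary medical center in Tennessee) to EHR data from 492 patients enrolled in a case-control study and recruited from a tertiary medical center in North Texas. The algorithm used ICD (International Classification of Diseases) and CPT (Current Procedural Terminology) codes to identify case and control status for degenerative rotator cuff tear. A manual review compared the algorithm's classification with a gold standard previously recorded by clinical researchers.

Results: Initially, the algorithm correctly identified 398 (80.9%) patients as cases or controls. After fine-tuning the algorithm and correcting errors in the gold standard dataset, we calculated a sensitivity of 0.94 and a specificity of 0.76.

Discussion: Implementing the algorithm was challenging because coding practices vary between medical centers. To improve performance, we refined the algorithm's data dictionary by adding codes. The process highlighted the need for meticulous code verification and standardization in multi-center studies.

Conclusion: Sharing case-control algorithms advances EHR research. Our rule-based algorithm improved multi-site patient identification and revealed 12 data entry errors, which helped validate our results.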
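A rule-based classifier of the kind described in the abstract can be sketched as set lookups against a data dictionary. The sketch below is illustrative only: the class names, the decision rule (any case-defining diagnosis or procedure code makes a case), and every code value are assumptions, not the published algorithm or its actual ICD/CPT code lists. It also shows how adding site-specific codes to the dictionary can change a classification, which is the kind of refinement the abstract reports.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodeDictionary:
    """Hypothetical data dictionary; all code values are placeholders."""
    case_dx: frozenset             # stand-ins for ICD diagnosis codes defining a case
    case_proc: frozenset           # stand-ins for CPT procedure codes defining a case
    control_exclusions: frozenset  # stand-ins for codes that disqualify a control

    def with_additional_codes(self, case_dx=(), case_proc=()):
        """Return a refined copy with extra site-specific codes added."""
        return CodeDictionary(
            case_dx=self.case_dx | frozenset(case_dx),
            case_proc=self.case_proc | frozenset(case_proc),
            control_exclusions=self.control_exclusions,
        )


@dataclass
class PatientRecord:
    patient_id: str
    dx_codes: set = field(default_factory=set)
    proc_codes: set = field(default_factory=set)


def classify(record, dictionary):
    """Assumed rule: any case-defining code makes a case; exclusions block control status."""
    if record.dx_codes & dictionary.case_dx or record.proc_codes & dictionary.case_proc:
        return "case"
    if (record.dx_codes | record.proc_codes) & dictionary.control_exclusions:
        return "unclassified"
    return "control"


# Placeholder codes only; a real dictionary would hold actual ICD and CPT codes.
base = CodeDictionary(
    case_dx=frozenset({"DX-A"}),
    case_proc=frozenset({"PROC-A"}),
    control_exclusions=frozenset({"DX-EXCL"}),
)
patient = PatientRecord("p1", dx_codes={"DX-LOCAL"})

print(classify(patient, base))     # control: the local code is not in the dictionary
refined = base.with_additional_codes(case_dx={"DX-LOCAL"})
print(classify(patient, refined))  # case: the refined dictionary recognizes the local code
```

Keeping the dictionary separate from the rule means a receiving site can extend the code lists without touching the classification logic.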

Note: the abstract was machine-translated from the English original; where the two differ, the English original prevails.


Application of an Externally Developed Algorithm to Identify Research Cases and Controls from EHR Data: Trials and Triumphs.

The use of electronic health records (EHRs) in research demands robust and interoperable systems. By linking biorepositories to EHR algorithms, researchers can efficiently identify cases and controls for large observational studies (e.g., genome-wide association studies). This is critical for ensuring efficient and cost-effective research. However, the lack of standardized metadata and algorithms across different EHRs complicates their sharing and application. Our study presents an example of a successful implementation and validation process. This study aimed to implement and validate a rule-based algorithm from a tertiary medical center in Tennessee to classify cases and controls from a research study on rotator cuff tear (RCT) nested within a tertiary medical center in North Texas and to assess the algorithm's performance. We applied a phenotypic algorithm (designed and validated in a tertiary medical center in Tennessee) using EHR data from 492 patients enrolled in a case-control study recruited from a tertiary medical center in North Texas. The algorithm leveraged the International Classification of Diseases and Current Procedural Terminology codes to identify case and control status for degenerative RCT. A manual review was conducted to compare the algorithm's classification with a previously recorded gold standard documented by clinical researchers. Initially the algorithm identified 398 (80.9%) patients correctly as cases or controls. After fine-tuning and correcting errors in our gold standard dataset, we calculated a sensitivity of 0.94 and a specificity of 0.76. The implementation of the algorithm presented challenges due to the variability in coding practices between medical centers. To enhance performance, we refined the algorithm's data dictionary by incorporating additional codes. The process highlighted the need for meticulous code verification and standardization in multi-center studies. Sharing case-control algorithms boosts EHR research. Our rule-based algorithm improved multi-site patient identification and revealed 12 data entry errors, helping validate our results.
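The performance figures in the abstract follow standard definitions: sensitivity is the share of gold-standard cases that the algorithm labels as cases, and specificity is the share of gold-standard controls that it labels as controls. The sketch below is a minimal illustration under stated assumptions: the function name and the five-patient label lists are made up for demonstration, because the abstract does not report the underlying confusion matrix. Only the 398-of-492 figure comes from the source.

```python
def sensitivity_specificity(gold, predicted):
    """Compare algorithm labels with gold-standard labels ('case' or 'control')."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == "case" and p == "case")
    fn = sum(1 for g, p in pairs if g == "case" and p != "case")
    tn = sum(1 for g, p in pairs if g == "control" and p == "control")
    fp = sum(1 for g, p in pairs if g == "control" and p != "control")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard needs at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Initial agreement reported in the abstract: 398 of 492 patients.
initial_agreement = 398 / 492
print(round(initial_agreement * 100, 1))  # 80.9

# Toy labels for illustration only; these are not study data.
gold = ["case", "case", "case", "control", "control"]
predicted = ["case", "case", "control", "control", "case"]
sens, spec = sensitivity_specificity(gold, predicted)
print(sens, spec)
```

In the toy lists, two of three gold-standard cases and one of two gold-standard controls are labeled correctly, so the function returns a sensitivity of 2/3 and a specificity of 1/2.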


Source journal

Applied Clinical Informatics (MEDICAL INFORMATICS)

CiteScore: 4.60

Self-citation rate: 24.10%

Articles published: 132

Journal description: ACI is the third Schattauer journal dealing with biomedical and health informatics. It perfectly complements our other journals, Methods of Information in Medicine and the Yearbook of Medical Informatics. With the Yearbook of Medical Informatics being the “Milestone” or state-of-the-art journal and Methods of Information in Medicine being the “Science and Research” journal of IMIA, ACI intends to be the “Practical” journal of IMIA.