Validated Models Using EHRs or Claims Data to Distinguish Diabetes Type among Adults

Advances in Diabetes & Endocrinology Pub Date : 1900-01-01 DOI:10.13188/2475-5591.1000018

JR Campione



Abstract

Purpose: Clinical data provide an opportunity for efficient and timely disease surveillance. We developed and validated advanced phenotyping models that classify adult patients with diabetes as type 1, type 2, or other/indeterminate using structured fields from EHR data. To simulate the use of claims data supplemented with medication information, we compared model performance before and after removing body mass index (BMI) and laboratory results.

Methods: We used 3 years of EHR data from a sample of 2,465 adult patients with diabetes drawn from a health care system's clinical data warehouse. A weighted ratio of type 1 diabetes codes to all diabetes codes was created by down-weighting codes from care settings that do not treat diabetes. We developed two multinomial regression models and a machine learning conditional inference tree to classify patients as type 1, type 2, or other/indeterminate. The models were validated by calculating sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) relative to a gold standard.

Results: For all models, the weighted ratio of type 1 diabetes codes was the strongest predictive factor. The models had validation statistics of ≥ 93% for sensitivity, ≥ 87% for specificity, ≥ 88% for PPV, and ≥ 93% for NPV. After BMI and laboratory data were removed from the regression model, the largest decline in performance relative to the full model was in type 2 diabetes specificity (90.8% to 89.2%).

Conclusion: Prediction models and machine learning conditional inference trees using either structured fields from EHR data or claims data supplemented with medication data can be used to accurately distinguish diabetes type among adults. The inclusion of BMI and laboratory results improves model specificity for type 2.
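The two computations named in the Methods, the weighted type 1 code ratio and the per-class validation statistics, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives neither the weights nor the care-setting categories, so `SETTING_WEIGHTS`, the setting names, and the `"T1"`/`"T2"`/`"other"` labels are all hypothetical.

```python
# Hypothetical weights: the abstract says only that codes from care settings
# that do not treat diabetes are down-weighted; the values here are assumed.
SETTING_WEIGHTS = {"endocrinology": 1.0, "primary_care": 1.0, "other": 0.5}


def weighted_type1_ratio(codes):
    """Weighted share of type 1 codes among all diabetes codes for one patient.

    `codes` is an iterable of (diabetes_type, care_setting) pairs, where
    diabetes_type is "T1" or "T2" (an assumed encoding).
    Returns None when the patient has no diabetes codes.
    """
    total = 0.0
    type1 = 0.0
    for diabetes_type, setting in codes:
        weight = SETTING_WEIGHTS.get(setting, SETTING_WEIGHTS["other"])
        total += weight
        if diabetes_type == "T1":
            type1 += weight
    return type1 / total if total else None


def one_vs_rest_metrics(y_true, y_pred, label):
    """Sensitivity, specificity, PPV and NPV for one class against the rest.

    `y_true` holds the gold-standard labels and `y_pred` the model output.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == label and truth == label:
            tp += 1
        elif pred == label:
            fp += 1
        elif truth == label:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # An undefined ratio (zero denominator) is reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Because the classification has three outcomes, each statistic is computed one class at a time against the other two combined, which is one common reading of per-type figures such as "type 2 diabetes specificity"; the abstract does not state exactly how the authors aggregated them.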


