Predicting type 1 diabetes in children using electronic health records in primary care in the UK: development and validation of a machine-learning algorithm
Prof Rhian Daniel PhD, Hywel Jones PGDip, Prof John W Gregory MD, Ambika Shetty MD, Prof Nick Francis PhD, Prof Shantini Paranjothy PhD, Julia Townson PhD
Lancet Digital Health. Published 2024-05-22. DOI: 10.1016/S2589-7500(24)00050-5
Funding: Diabetes UK.
Abstract
Background
Children presenting to primary care with suspected type 1 diabetes should be referred immediately to secondary care to avoid life-threatening diabetic ketoacidosis. However, early recognition of children with type 1 diabetes is challenging. Children might not present with classic symptoms, or symptoms might be attributed to more common conditions. A quarter of children present with diabetic ketoacidosis, a proportion unchanged over 25 years. Our aim was to investigate whether a machine-learning algorithm could lead to earlier detection of type 1 diabetes in primary care.
Methods
We developed the predictive algorithm using Welsh primary care electronic health records (EHRs) linked to the Brecon Dataset, a register of children newly diagnosed with type 1 diabetes. Children were included from their first primary care record within the study period (Jan 1, 2000, to Dec 31, 2016) until type 1 diabetes diagnosis, their 15th birthday, or the end of the study. We developed an ensemble learner (SuperLearner) using 26 potential predictors. The algorithm was validated in English EHRs from the Clinical Practice Research Datalink (primary care) and Hospital Episode Statistics, focusing on its ability to identify children who went on to develop type 1 diabetes and the time by which diagnosis could be anticipated.
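To make the idea of an ensemble learner concrete, the following is a minimal sketch of the general SuperLearner recipe: obtain cross-validated predictions from several base learners, then choose a convex combination that minimises a cross-validated loss. It is not the authors' implementation. The two base learners, the single binary predictor, the toy data, the Brier-score loss, and the grid search over weights are all assumptions made for illustration; the abstract does not say which learners or which of the 26 predictors were used.

```python
def fit_prevalence(X, y):
    """Hypothetical base learner 1: ignore predictors, return the overall outcome rate."""
    rate = sum(y) / len(y)
    return lambda row: rate

def fit_stratified(X, y, col=0):
    """Hypothetical base learner 2: outcome rate within each level of one binary predictor."""
    overall = sum(y) / len(y)
    rates = {}
    for level in (0, 1):
        ys = [yi for row, yi in zip(X, y) if row[col] == level]
        rates[level] = sum(ys) / len(ys) if ys else overall
    return lambda row: rates[row[col]]

def cv_predictions(learners, X, y, k=5):
    """Out-of-fold predictions, one column per base learner."""
    n = len(y)
    Z = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        for j, fit in enumerate(learners):
            model = fit(X_train, y_train)
            for i in test:
                Z[i][j] = model(X[i])
    return Z

def super_learner(learners, X, y, k=5, grid=101):
    """Two-learner convex ensemble; the weight minimises the cross-validated Brier score."""
    Z = cv_predictions(learners, X, y, k)

    def brier(w):
        return sum((w * z[0] + (1 - w) * z[1] - yi) ** 2
                   for z, yi in zip(Z, y)) / len(y)

    # Grid search over the weight on the first learner (an illustrative stand-in
    # for a proper meta-learning step).
    best_w = min((i / (grid - 1) for i in range(grid)), key=brier)
    fitted = [fit(X, y) for fit in learners]  # refit base learners on all data

    def predict(row):
        return best_w * fitted[0](row) + (1 - best_w) * fitted[1](row)

    return predict, [best_w, 1 - best_w]

# Toy data (invented): one binary predictor; half of the flagged rows are cases.
X = [[1 if i % 4 == 0 else 0] for i in range(40)]
y = [1 if i % 8 == 0 else 0 for i in range(40)]

predict, weights = super_learner([fit_prevalence, fit_stratified], X, y)
print(weights)
print(predict([1]), predict([0]))
```

On this toy data the stratified learner receives all of the weight, because the constant-prevalence learner adds nothing once the predictor is used. With real EHR data and a richer library of learners the weights would typically be spread across several of them.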
Findings
The development dataset comprised 34 754 400 primary care contacts, relating to 952 402 children, and the validation dataset comprised 43 089 103 primary care contacts, relating to 1 493 328 children. Of these, 1829 (0·19%) children younger than 15 years in the development dataset and 1516 (0·10%) in the validation dataset had a reliable date of type 1 diabetes diagnosis. If set to give an alert in 10% of contacts, an estimated 71·6% (95% CI 68·8–74·4) of the children with type 1 diabetes would receive an alert by the algorithm in the 90 days before diagnosis, with diagnosis anticipated, on average, by an estimated 9·34 days (95% CI 7·77–10·9).
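The two headline metrics can be illustrated with a small sketch: pick the score cut-off that flags a chosen fraction of contacts, then compute the share of diagnosed children who receive an alert in the window before diagnosis and the average number of days by which the earliest alert precedes diagnosis. The data layout, function names, and toy values below are hypothetical. Counting children without an alert as zero days anticipated is also an assumption, since the abstract does not give the exact definition used in the paper.

```python
from datetime import date

def alert_threshold(scores, alert_fraction=0.10):
    """Score cut-off such that roughly `alert_fraction` of contacts are flagged."""
    ranked = sorted(scores, reverse=True)
    n_alerts = max(1, round(alert_fraction * len(ranked)))
    return ranked[n_alerts - 1]

def evaluate_alerts(contacts, diagnoses, threshold, window_days=90):
    """contacts: list of (child_id, contact_date, score).
    diagnoses: dict mapping child_id to diagnosis date.
    Returns (share of diagnosed children alerted within the window,
             mean days from earliest in-window alert to diagnosis,
             with 0 days for children who were never alerted)."""
    earliest_lead = {}
    for child, when, score in contacts:
        dx = diagnoses.get(child)
        if dx is None or score < threshold:
            continue  # not a diagnosed child, or no alert at this contact
        lead = (dx - when).days
        if 0 <= lead <= window_days:
            earliest_lead[child] = max(earliest_lead.get(child, 0), lead)
    n = len(diagnoses)
    return len(earliest_lead) / n, sum(earliest_lead.values()) / n

# Toy data (invented): ten contacts, two children later diagnosed.
diagnoses = {"A": date(2016, 6, 30), "B": date(2016, 9, 1)}
contacts = [
    ("A", date(2016, 6, 20), 0.90), ("A", date(2016, 1, 5), 0.20),
    ("B", date(2016, 8, 25), 0.30), ("C", date(2016, 3, 3), 0.80),
    ("C", date(2016, 4, 1), 0.10), ("D", date(2016, 2, 2), 0.05),
    ("D", date(2016, 5, 9), 0.15), ("E", date(2016, 7, 7), 0.25),
    ("E", date(2016, 8, 8), 0.12), ("F", date(2016, 9, 9), 0.07),
]

# 0.20 is used here only so that the ten-contact toy set yields two alerts.
threshold = alert_threshold([s for _, _, s in contacts], alert_fraction=0.20)
sensitivity, mean_lead = evaluate_alerts(contacts, diagnoses, threshold)
print(threshold, sensitivity, mean_lead)
```

In the toy data, child A is alerted 10 days before diagnosis and child B is missed, so half of the diagnosed children are alerted and the mean anticipation is 5 days. Child C's alert is a false positive, the kind of cost that the choice of alert fraction trades off against earlier detection.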
Interpretation
If implemented into primary care settings, this predictive algorithm could substantially reduce the proportion of patients with new-onset type 1 diabetes presenting in diabetic ketoacidosis. Acceptability of alert thresholds should be explored in primary care.
About the journal:
The Lancet Digital Health publishes important, innovative, and practice-changing research on any topic connected with digital technology in clinical medicine, public health, and global health.
The journal’s open access content crosses subject boundaries, building bridges between health professionals and researchers. By bringing together the most important advances in this multidisciplinary field, The Lancet Digital Health is the most prominent publishing venue in digital health.
We publish a range of content types including Articles, Review, Comment, and Correspondence, contributing to promoting digital technologies in health practice worldwide.