A prediction model for genetic cholestatic disease in infancy using the machine learning approach.
Authors: Chi-San Tai, Sung-Chu Ko, Chien-Chang Lee, Hui-Ru Yang, Chia-Ray Lin, Byung-Ho Choe, Suporn Treepongkaruna, Voranush Chongsrisawat, Chau-Chung Wu, Huey-Ling Chen
Journal: Journal of Pediatric Gastroenterology and Nutrition, pages 933-942
DOI: 10.1002/jpn3.70166 (https://doi.org/10.1002/jpn3.70166)
Published: 2025-10-01 (Epub 2025-07-30)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12484703/pdf/
Citation count: 0
Abstract
Objectives: Cholestasis in infancy is a complex diagnostic challenge for pediatric hepatologists and warrants timely diagnosis, especially when the cause is genetic. This study aims to create machine learning (ML)-based prediction models, referred to as Jaundice Diagnosis Easy for Baby (JADE-B), to identify infants likely to have a genetic cause of cholestasis.
Methods: We retrieved patient data from the Integrated Medical Database at a university-affiliated tertiary medical center from 2006 to 2018. Patients with cholestatic disease were identified using liver-disease-specific International Classification of Diseases codes. A total of 47 clinical and laboratory parameters were used as ML inputs to predict a positive genetic diagnosis, defined as a disease-specific genetic diagnosis matching the phenotype. Four classifiers (logistic regression, XGBoost (XGB), LightGBM (LGBM), and random forest) were used to build the models.
Results: From a pool of 1845 patients, 1008 infants under 1 year of age diagnosed with cholestatic liver disease were included in the analysis. A set of 47 pertinent clinical and laboratory features was used to train the ML models. We built five sets of models (Models 1-5), yielding areas under the receiver operating characteristic curve of 0.869, 0.884, 0.855, 0.852, and 0.836, respectively. A JADE-B model was built using 20 simple and widely accessible clinical parameters at disease onset, up to 1 month, to predict patients with genetic disorders.
Conclusions: The machine learning model prioritizes cholestatic infants for genetic diagnostic testing and patient referral, and optimizes the use of genetic diagnostic resources.
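The performance figures in the Results are areas under the receiver operating characteristic curve (AUROC). As a rough illustration of what that metric measures, the sketch below computes AUROC from its rank-based definition: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study or from the JADE-B models.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study): 1 = genetic cause confirmed.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.91, 0.62, 0.35, 0.70, 0.30, 0.22, 0.15, 0.08]
print(round(auroc(labels, scores), 3))  # prints 0.867
```

In this toy example, 13 of the 15 positive-negative pairs are ranked correctly, giving 13/15 ≈ 0.867. An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.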
About the journal:
The Journal of Pediatric Gastroenterology and Nutrition (JPGN) provides a forum for original papers and reviews dealing with pediatric gastroenterology and nutrition, covering normal and abnormal functions of the alimentary tract and its associated organs, including the salivary glands, pancreas, gallbladder, and liver. Particular emphasis is on development and its relation to infant and childhood nutrition.