Fatemeh Masaebi, Mehdi Azizmohammad Looha, Morteza Mohammadzadeh, Vida Pahlevani, Mojtaba Farjam, Farid Zayeri, Reza Homayounfar
Archives of Iranian Medicine, vol. 27, no. 10, pp. 551-562. Published 2024-10-01. DOI: 10.34172/aim.31269 (https://doi.org/10.34172/aim.31269)
Machine-Learning Application for Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease Using Laboratory and Body Composition Indicators.
Background: Metabolic dysfunction-associated steatotic liver disease (MASLD) represents a significant global health burden without established curative therapies. Early detection and preventive strategies are crucial for effective MASLD management. This study aimed to develop and validate machine-learning (ML) algorithms for accurate MASLD screening in a geographically diverse, large-scale population.
Methods: Data from the prospective Fasa Cohort Study, initiated in rural Fars province, Iran (March 2014), were employed for this purpose. The required data were collected using blood tests, questionnaires, liver ultrasonography, and physical examinations. A two-step approach identified key predictors from over 100 variables: (1) statistical selection using mean decrease Gini in random forest and (2) incorporation of clinical expertise for alignment with known MASLD risk factors. The hold-out validation approach (with a 70/30 train/validation split) was utilized, along with 5-fold cross-validation on the validation set. Logistic regression, Naïve Bayes, support vector machine, and light gradient-boosting machine (LightGBM) algorithms were compared for model construction with the same input variables based on area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy.
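The comparison metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names `screening_metrics` and `auc` are hypothetical, and the counts and scores below are invented purely for illustration. It shows how sensitivity, specificity, PPV, NPV, and accuracy follow from a confusion matrix, and how AUC can be read as the probability that a randomly chosen MASLD case receives a higher predicted score than a randomly chosen non-MASLD case.

```python
def screening_metrics(tp, fp, tn, fn):
    """Confusion-matrix metrics for a binary screening model.

    tp/fn: MASLD cases classified as positive/negative.
    tn/fp: non-MASLD cases classified as negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly cleared
        "ppv": tp / (tp + fp),           # share of positive calls that are true cases
        "npv": tn / (tn + fn),           # share of negative calls that are true non-cases
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc(scores_pos, scores_neg):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and predicted probabilities, not study data.
metrics = screening_metrics(tp=80, fp=20, tn=180, fn=20)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.4])

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    print(f"auc: {example_auc:.3f}")
```

The pairwise AUC loop is quadratic in the number of participants, so it suits small examples only; a rank-based computation would be the usual choice for a cohort of this size.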
Results: A total of 6,180 adults (52.7% female) were included in the study, categorized into 4,816 non-MASLD and 1,364 MASLD cases with a mean age (±standard deviation [SD]) of 48.12 (±9.61) and 49.47 (±9.15) years, respectively. Logistic regression outperformed other ML algorithms, achieving an accuracy of 0.88 (95% confidence interval [CI]: 0.86-0.89) and an AUC of 0.92 (95% CI: 0.90-0.93). Among more than 100 variables, the key predictors included waist circumference, body mass index (BMI), hip circumference, wrist circumference, alanine aminotransferase levels, cholesterol, glucose, high-density lipoprotein, and blood pressure.
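For context on the reported accuracy, the short calculation below uses only the counts given in the Results. It computes the MASLD prevalence in the cohort and the accuracy a trivial rule would reach by labelling everyone non-MASLD. The final step is an assumption rather than something stated in the abstract: it takes the validation set to be exactly 30% of the cohort and applies a simple normal-approximation (Wald) interval to the reported accuracy. The abstract does not say how the authors derived their confidence intervals, so this is only a plausibility check.

```python
import math

# Counts reported in the Results.
n_total = 6180
n_masld = 1364
n_non_masld = 4816

prevalence = n_masld / n_total             # share of participants with MASLD
majority_baseline = n_non_masld / n_total  # accuracy of always predicting non-MASLD

# ASSUMPTION: validation set is exactly 30% of the cohort (70/30 hold-out split).
n_validation = n_total * 30 // 100

# ASSUMPTION: Wald interval; the authors' actual CI method is not stated.
reported_accuracy = 0.88
standard_error = math.sqrt(reported_accuracy * (1 - reported_accuracy) / n_validation)
ci_low = reported_accuracy - 1.96 * standard_error
ci_high = reported_accuracy + 1.96 * standard_error

if __name__ == "__main__":
    print(f"prevalence: {prevalence:.3f}")
    print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
    print(f"approximate 95% CI for accuracy: {ci_low:.3f} to {ci_high:.3f}")
```

Because roughly 78% of participants are non-MASLD, accuracy alone overstates a model's usefulness here, which is one reason to read it alongside AUC, sensitivity, and PPV.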
Conclusion: Integration of ML in MASLD management holds significant promise, particularly in resource-limited rural settings. Additionally, the relative importance assigned to each predictor, particularly prominent contributors such as waist circumference and BMI, offers valuable insights into MASLD prevention, diagnosis, and treatment strategies.
About the journal:
Aim and Scope: The Archives of Iranian Medicine (AIM) is a monthly peer-reviewed multidisciplinary medical publication. The journal welcomes contributions particularly relevant to the Middle East region and publishes biomedical experiences and clinical investigations on prevalent diseases in the region as well as analyses of factors that may modulate the incidence, course, and management of diseases and pertinent medical problems. Manuscripts with didactic orientation and subjects exclusively of local interest will not be considered for publication. The 2016 Impact Factor of "Archives of Iranian Medicine" is 1.20.