The Visualization of the Importance of Covariance Importance in a Machine Learning Model for Advanced Liver Fibrosis in a Nationally Representative Sample
Authors: Alexander A. Huang, Samuel Y. Huang
Journal: JGH Open, volume 9, issue 7 (published 2025-07-14)
DOI: 10.1002/jgh3.70200
URL: https://onlinelibrary.wiley.com/doi/10.1002/jgh3.70200
Introduction
Accurate prediction of liver disease is vital for early intervention, given its potential severity. This study aims to improve the prediction of advanced liver fibrosis and to investigate its associations with demographic, laboratory, physical examination, and lifestyle factors, ultimately supporting healthier lifestyle choices and timely management of liver disease.
Methods
This cross-sectional study included adults from the US National Health and Nutrition Examination Survey (NHANES, 2017–2020). Questionnaires captured demographic, dietary, exercise, and mental health information. Advanced fibrosis was defined by liver stiffness measurement (LSM) at a 9.5 kPa threshold. An XGBoost machine learning model was used to predict advanced fibrosis, and its performance was assessed using the area under the receiver operating characteristic curve (AUROC). SHAP (SHapley Additive exPlanations) provided visual explanations of the model's predictions and feature contributions. Feature importance was measured by model gain, cover, and frequency, enabling transparent and interpretable analysis.
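The outcome definition and the AUROC metric described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy LSM values, and the predicted risks are hypothetical, and treating the 9.5 kPa threshold as inclusive (LSM >= 9.5) is an assumption, since the abstract does not say whether the cutoff is strict.

```python
from typing import Sequence

LSM_THRESHOLD_KPA = 9.5  # threshold stated in the Methods


def label_advanced_fibrosis(lsm_kpa: float) -> int:
    """Return 1 if LSM meets the threshold, else 0 (inclusive cutoff is an assumption)."""
    return int(lsm_kpa >= LSM_THRESHOLD_KPA)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case; tied scores count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not study data).
lsm_values = [4.8, 12.1, 6.3, 9.5, 7.0, 15.4]
predicted_risk = [0.10, 0.80, 0.35, 0.30, 0.20, 0.90]

labels = [label_advanced_fibrosis(v) for v in lsm_values]
example_auroc = auroc(labels, predicted_risk)
print(labels, round(example_auroc, 3))
```

The pairwise comparison is quadratic in the number of cases and is written for clarity; a rank-based computation would be preferable at the scale of a survey sample.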
Results
The study included 6979 adults (age > 18 years) with a mean age of 49.02 years; 3523 (50%) were female. The machine learning model achieved an area under the receiver operating characteristic curve (AUROC) of 0.885. The top covariates by gain included waist circumference (gain = 0.185), GGT (gain = 0.101), platelet count (gain = 0.059), AST (gain = 0.057), weight (gain = 0.049), HDL-cholesterol (gain = 0.032), and ferritin (gain = 0.034).
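To make the gain figures above easier to read, the sketch below shows one common way gain, cover, and frequency are derived from the splits of a tree ensemble. It assumes each metric is normalized to sum to 1 across features; the abstract does not state how the values were computed, so this convention is an assumption. The split records and the `importance_table` helper are hypothetical. Only the `reported_gains` values come from the Results.

```python
from collections import defaultdict

# Hypothetical split records: (feature, loss reduction at the split,
# weighted number of observations reaching the split).
splits = [
    ("waist_circumference", 40.0, 500.0),
    ("ggt", 25.0, 300.0),
    ("waist_circumference", 20.0, 200.0),
    ("platelet_count", 15.0, 250.0),
]


def importance_table(split_records):
    """Aggregate per-feature gain, cover, and split count, then normalize
    each metric so that it sums to 1 across features (assumed convention)."""
    gain = defaultdict(float)
    cover = defaultdict(float)
    count = defaultdict(int)
    for feature, split_gain, split_cover in split_records:
        gain[feature] += split_gain
        cover[feature] += split_cover
        count[feature] += 1
    total_gain = sum(gain.values())
    total_cover = sum(cover.values())
    total_count = sum(count.values())
    return {
        f: {
            "gain": gain[f] / total_gain,
            "cover": cover[f] / total_cover,
            "frequency": count[f] / total_count,
        }
        for f in gain
    }


table = importance_table(splits)

# Gains reported in the Results; under the sum-to-1 assumption, their total
# is the share of model gain carried by the listed covariates.
reported_gains = [0.185, 0.101, 0.059, 0.057, 0.049, 0.032, 0.034]
listed_share = sum(reported_gains)
print(table["waist_circumference"], round(listed_share, 3))
```

Under that assumption, the seven listed covariates together account for about 0.517 of total gain, with waist circumference alone contributing more than a third of that share.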
Conclusion
Machine learning models were effective in predicting the risk of advanced liver fibrosis. By considering demographic information, laboratory results, physical examination findings, and lifestyle factors, these models identified key risk factors associated with liver fibrosis.