Mengxia Fu, Zhiming Peng, Xue Yu, Dapeng Lv, Min Wu
BMC Medical Informatics and Decision Making, 25(1):341. Published 2025-09-26. DOI: 10.1186/s12911-025-03189-z (https://doi.org/10.1186/s12911-025-03189-z). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12465519/pdf/
Developing an interpretable machine learning model for easily detecting insulin resistance among breast cancer survivors: a cross-sectional study.
Objective: To develop and validate a classification model for insulin resistance in female individuals who have survived breast cancer using easily obtainable clinical and demographic features.
Methods: Data were obtained from the U.S. National Health and Nutrition Examination Survey (NHANES) spanning 1999 to March 2020. A total of 340 female individuals who have survived breast cancer were included, and participants were randomly assigned to a training set (n = 239) and a testing set (n = 101). Multiple machine learning algorithms were trained, including Logistic Regression, Random Forest, and Support Vector Machine. Model performance was evaluated using area under the receiver operating characteristic curve (AUC) and decision curve analysis (DCA).
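The split and the AUC metric described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the helper names `split_indices` and `roc_auc`, the seed, and the toy labels and scores are assumptions made for illustration, and only the 340 / 239 / 101 sample sizes come from the abstract. AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def split_indices(n_total, n_test, seed=0):
    """Randomly partition range(n_total) into (train, test) index lists."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    idx = list(range(n_total))
    rng.shuffle(idx)
    return idx[n_test:], idx[:n_test]


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Sample sizes reported in the abstract: 340 participants, 101 held out for testing.
train_idx, test_idx = split_indices(340, 101)
print(len(train_idx), len(test_idx))  # 239 101

# Hypothetical labels (1 = insulin resistant) and model scores, not study data.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for a test set of 101; a rank-based implementation would be the usual choice for larger data.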
Results: All models demonstrated strong classification performance in the testing set, with AUC values exceeding 0.87. Among them, the Random Forest and Support Vector Machine models showed superior performance in DCA. Of the seven input features (body mass index, fasting blood glucose, triglyceride, HDL cholesterol, poverty income ratio, race, and education), fasting blood glucose had the highest positive feature importance for classifying insulin resistance.
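For readers unfamiliar with decision curve analysis, the quantity it compares across models is net benefit at a chosen threshold probability. The sketch below uses the standard definition, net benefit = TP/N - (FP/N) * pt / (1 - pt). The abstract does not state how the authors computed DCA, so the function names and the toy data are assumptions, not the study's implementation or results.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every participant as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical labels and predicted probabilities, not study data.
labels = [1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.7, 0.2, 0.1]

print(f"{net_benefit(labels, probs, 0.5):.2f}")        # 0.20
print(f"{net_benefit_treat_all(labels, 0.5):.2f}")     # -0.20
```

A decision curve plots these values over a range of thresholds; a model is preferred where its curve lies above both the treat-all line and the treat-none line (net benefit 0).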
Conclusions: This study demonstrates the feasibility of using machine learning algorithms to accurately predict insulin resistance in individuals who have survived breast cancer with a limited set of clinical and demographic variables. The Random Forest and Support Vector Machine models, in particular, offer strong classification performance and may support clinicians in early identification and management of insulin resistance among individuals in this high-risk population.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.