Predictive modeling for the development of diabetes mellitus using key factors in various machine learning approaches
Authors: Marenao Tanaka, Yukinori Akiyama, Kazuma Mori, Itaru Hosaka, Kenichi Kato, Keisuke Endo, Toshifumi Ogawa, Tatsuya Sato, Toru Suzuki, Toshiyuki Yano, Hirofumi Ohnishi, Nagisa Hanawa, Masato Furuhashi
DOI: 10.1016/j.deman.2023.100191
Publication date: 2024-01-01
URL: https://www.sciencedirect.com/science/article/pii/S2666970623000707
Aims
Machine learning (ML) approaches are beneficial when automatic identification of relevant features among numerous candidates is desired. We investigated the predictive ability of several ML models for new onset of diabetes mellitus.
Methods
In 10,248 subjects who received annual health examinations, 58 candidate predictors were used, including the fatty liver index (FLI), which is calculated from waist circumference, body mass index and levels of triglycerides and γ-glutamyl transferase.
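The abstract names the four inputs to the FLI but not the formula. The following is a minimal sketch of the commonly cited formulation (Bedogni et al., 2006), assuming triglycerides in mg/dL, γ-glutamyl transferase in U/L, body mass index in kg/m² and waist circumference in cm. The function name, parameter names and units are illustrative assumptions, not taken from the study.

```python
import math


def fatty_liver_index(triglycerides_mg_dl, bmi_kg_m2, ggt_u_l, waist_cm):
    """Return the fatty liver index on a 0-100 scale.

    Coefficients follow the commonly cited formulation (Bedogni et al., 2006);
    the units are assumptions and are not stated in the abstract.
    """
    # Linear predictor: log-transformed lab values plus anthropometric terms.
    linear = (
        0.953 * math.log(triglycerides_mg_dl)
        + 0.139 * bmi_kg_m2
        + 0.718 * math.log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    # Logistic transform, scaled to 0-100.
    return 100.0 / (1.0 + math.exp(-linear))


# Hypothetical subject, for illustration only.
example_fli = fatty_liver_index(
    triglycerides_mg_dl=150, bmi_kg_m2=25, ggt_u_l=30, waist_cm=90
)
```

Because the linear predictor passes through a logistic transform, the index is bounded between 0 and 100 and rises with each of the four inputs.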
Results
During a 10-year follow-up period (mean: 6.9 years), 322 subjects (6.5 %) in the training group (70 %, n=7,173) and 127 subjects (6.2 %) in the test group (30 %, n=3,075) had new onset of diabetes mellitus. Random forest feature selection with 10-fold cross-validation identified hemoglobin A1c, fasting glucose and FLI as the top 3 predictors. When hemoglobin A1c and FLI were used as the selected features, the C-statistics (analogous to the area under the receiver operating characteristic curve) of ML models including logistic regression, naïve Bayes, extreme gradient boosting and artificial neural network were 0.874, 0.869, 0.856 and 0.869, respectively. There was no significant difference in discriminatory capacity among the ML models.
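For readers unfamiliar with the C-statistic, it can be read as the probability that a randomly chosen subject who developed diabetes received a higher predicted risk than a randomly chosen subject who did not, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy scores are hypothetical and do not reproduce the study's data or code.

```python
def c_statistic(scores, outcomes):
    """Concordance statistic for binary outcomes (1 = event, 0 = no event)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for non_case_score in non_cases:
            if case_score > non_case_score:
                concordant += 1.0  # case ranked above non-case
            elif case_score == non_case_score:
                concordant += 0.5  # tie counts as half
    return concordant / (len(cases) * len(non_cases))


# Toy predicted risks and observed outcomes, for illustration only.
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_outcomes = [1, 1, 0, 0]
toy_c = c_statistic(toy_scores, toy_outcomes)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of subjects, so for cohorts of the size described here a rank-based implementation from a statistics library would be the more practical choice.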
Conclusions
ML models incorporating hemoglobin A1c and FLI provide an accurate and straightforward approach for predicting the development of diabetes mellitus.
About the journal:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research, affording the reader a closer look at the content and significance of an article. By providing a more detailed description of the article's contents, the Conspectus makes the article easier for search engines to discover and increases the exposure of the research.