Mohammad Saheb-Honar, M. Gholampour Dehaki, M. H. Kazemi-Galougahi, Saeed Soleiman-Meigooni
Journal of Archives in Military Medicine. Published 2022-06-12. DOI: 10.5812/jamm-118525 (https://doi.org/10.5812/jamm-118525)
A Comparison of Three Research Methods: Logistic Regression, Decision Tree, and Random Forest to Reveal Association of Type 2 Diabetes with Risk Factors and Classify Subjects in a Military Population
Background: Type 2 diabetes mellitus (T2DM) is one of the major non-communicable diseases, causing morbidity and mortality worldwide. No study has yet examined T2DM status in the Iran Army Forces. Objectives: We aimed to measure the prevalence of T2DM in this population and to identify variables associated with T2DM risk in order to classify individuals. Methods: Data from 3661 members of the Iran Army Ground Forces were used. Characteristics of subjects with and without T2DM were compared. We compared the classification ability of logistic regression with that of two tree-based supervised learning algorithms, decision tree and random forest (RF). The ethical committee of AJA University of Medical Sciences approved this study (approval code 995685). Results: The prevalence of T2DM was 3% less than in the general population. The incidence of T2DM increased with age. The proportion of staff members with T2DM was higher than that of the other military ranks. T2DM was more common in the obese and overweight groups. The prevalence of T2DM was highest among subjects with high lipid profile levels. The areas under the receiver operating characteristic curve (AUC) for logistic regression, decision tree, and RF were 73.8%, 77.1%, and 97.1%, respectively. Conclusions: Age, body mass index, total cholesterol, low-density lipoprotein cholesterol, and triglyceride are associated with T2DM risk. RF showed superior classification performance compared with logistic regression and decision tree.
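
The three classifiers are compared by the area under the receiver operating characteristic curve. As a minimal sketch of what that number measures, the Python below computes it in its rank form: the probability that a randomly chosen subject with T2DM receives a higher predicted score than a randomly chosen subject without it, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations. They are not the study's data, and this is not necessarily how the authors computed their values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive, e.g. T2DM).
    scores: iterable of predicted scores, higher meaning "more likely positive".
    Returns the fraction of (positive, negative) pairs in which the positive
    has the higher score; tied pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, for illustration only).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65, 0.50, 0.30]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to scores that carry no information about the class, and 1.0 to perfect separation. On that scale, the reported 97.1% for RF means that nearly every such pair is ranked correctly. The pairwise loop is quadratic in the number of subjects; for a data set of several thousand records, a sort-based rank computation would be the usual choice.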