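A Comparison of Three Research Methods: Logistic Regression, Decision Tree, and Random Forest to Reveal Association of Type 2 Diabetes with Risk Factors and Classify Subjects in a Military Population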

Mohammad Saheb-Honar, M. Gholampour Dehaki, M. H. Kazemi-Galougahi, Saeed Soleiman-Meigooni
Journal: Journal of Archives in Military Medicine
Volume: 22 1
Publication date: 2022-06-12
Publication type: Journal Article
DOI: 10.5812/jamm-118525 (https://doi.org/10.5812/jamm-118525)
Citations: 3

Abstract

A Comparison of Three Research Methods: Logistic Regression, Decision Tree, and Random Forest to Reveal Association of Type 2 Diabetes with Risk Factors and Classify Subjects in a Military Population
Background: Type 2 diabetes mellitus (T2DM) is a major non-communicable disease and a cause of morbidity and mortality worldwide. No previous study has examined T2DM status in the Iran Army Forces.
Objectives: We aimed to measure the prevalence of T2DM in this population and to identify variables associated with T2DM risk in order to classify individuals.
Methods: Data from 3661 members of the Iran Army Ground Forces were used. Characteristics of subjects with and without T2DM were compared. We compared the classification ability of logistic regression with that of two tree-based supervised learning algorithms, decision tree and random forest (RF). The ethics committee of AJA University of Medical Sciences approved this study (approval code 995685).
Results: The prevalence of T2DM was 3% lower than in the general population. The incidence of T2DM increased with age. The proportion of subjects with T2DM was higher among staff members than among other military ranks. T2DM was more common in the obese and overweight groups. The prevalence of T2DM was highest in subjects with high lipid profile levels. The areas under the receiver operating characteristic curve for logistic regression, decision tree, and RF were 73.8%, 77.1%, and 97.1%, respectively.
Conclusions: Age, body mass index, total cholesterol, low-density lipoprotein cholesterol, and triglyceride are associated with T2DM risk. RF has superior classification performance compared with logistic regression and decision tree.
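The abstract compares the three classifiers by the area under the receiver operating characteristic curve (AUC). As a rough illustration of what that number measures, the sketch below computes an AUC from predicted risk scores using the pairwise (Mann-Whitney) definition. It is not the authors' code: the function name `roc_auc`, the two "models", and all labels and scores are made up for the example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    The AUC equals the probability that a randomly chosen positive case
    gets a higher score than a randomly chosen negative case; a tie
    between a positive and a negative counts as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = T2DM, 0 = no T2DM.
# The scores are invented predicted risks from two imaginary models.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.45]
model_b = [0.05, 0.10, 0.60, 0.20, 0.90, 0.70, 0.30, 0.65]

auc_a = roc_auc(y_true, model_a)
auc_b = roc_auc(y_true, model_b)
print(f"model A AUC = {auc_a:.4f}")
print(f"model B AUC = {auc_b:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases. The abstract does not say whether the reported values were measured on training data or on held-out data, so the sketch makes no assumption about that.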