Nathaniel A. Dell, Michael G. Vaughn, Sweta Prasad Srivastava, Abdulaziz Alsolami, Christopher P. Salas-Wright
Journal of Psychiatric Research, Volume 151, Pages 590-597. Published 2022-07-01. DOI: 10.1016/j.jpsychires.2022.05.021. https://www.sciencedirect.com/science/article/pii/S0022395622002746
Correlates of cannabis use disorder in the United States: A comparison of logistic regression, classification trees, and random forests
Although several recent studies have examined psychosocial and demographic correlates of cannabis use disorder (CUD) in adults, few, if any, have evaluated the performance of machine learning methods relative to standard logistic regression for identifying these correlates. The present study used pooled data from the 2015–2018 National Survey on Drug Use and Health to evaluate psychosocial and demographic correlates of CUD in adults. In addition, we compared the performance of logistic regression, classification trees, and random forest methods in classifying CUD. On the test data set, classification trees (AUC = 0.84, 95% CI: 0.82, 0.85) and random forests (AUC = 0.83, 95% CI: 0.82, 0.85) performed similarly to each other and better than logistic regression (AUC = 0.77, 95% CI: 0.74, 0.79). Results of the random forests reveal that marital status, risk propensity, age, and cocaine dependence variables contributed most to node purity, whereas model accuracy would decrease significantly if county type, income, race, and education variables were excluded from the model. Machine learning techniques are one possible approach to improving the efficiency, interpretability, and clinical insight of analyses of CUD correlates.
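The model comparison above rests on the test-set AUC and its 95% confidence interval. The abstract does not say how those intervals were computed, so the following is only a minimal sketch under stated assumptions: it uses the rank-based (Mann-Whitney) form of the AUC and a percentile bootstrap, and it runs on synthetic labels and scores rather than the survey data. The names `auc`, `bootstrap_ci`, and the two simulated "models" are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic test set: 120 cases, two hypothetical models whose scores
# differ only in how much noise is added to the true label.
rng = random.Random(42)
y_test = [rng.randint(0, 1) for _ in range(120)]
noisy_scores = [y + rng.gauss(0, 1.5) for y in y_test]
sharper_scores = [y + rng.gauss(0, 0.7) for y in y_test]

for name, scores in [("noisy model", noisy_scores),
                     ("sharper model", sharper_scores)]:
    low, high = bootstrap_ci(y_test, scores)
    print(f"{name}: AUC = {auc(y_test, scores):.2f}, "
          f"95% CI: {low:.2f}, {high:.2f}")
```

Intervals that overlap heavily, as reported for classification trees and random forests, suggest the methods are hard to tell apart on this metric, whereas a clearly lower interval, as reported for logistic regression, points to a real difference in discrimination.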
About the journal:
Founded in 1961 to report on the latest work in psychiatry and cognate disciplines, the Journal of Psychiatric Research is dedicated to innovative and timely studies of four important areas of research:
(1) clinical studies of all disciplines relating to psychiatric illness, as well as normal human behaviour, including biochemical, physiological, genetic, environmental, social, psychological and epidemiological factors;
(2) basic studies pertaining to psychiatry in such fields as neuropsychopharmacology, neuroendocrinology, electrophysiology, genetics, experimental psychology and epidemiology;
(3) the growing application of clinical laboratory techniques in psychiatry, including imagery and spectroscopy of the brain, molecular biology and computer sciences;