Neda Amoori, Bahman Cheraghian, Payam Amini, Seyed Mohammad Alavi
{"title":"Identification of Risk Factors Associated with Tuberculosis in Southwest Iran: A Machine Learning Method.","authors":"Neda Amoori, Bahman Cheraghian, Payam Amini, Seyed Mohammad Alavi","doi":"10.47176/mjiri.38.5","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Tuberculosis is a principal public health issue. Reducing and controlling tuberculosis did not result in the expected success despite implementing effective preventive and therapeutic programs, one of the reasons for which is the delay in definitive diagnosis. Therefore, creating a diagnostic aid system for tuberculosis screening can help in the early diagnosis of this disease. This research aims to use machine learning techniques to identify economic, social, and environmental factors affecting tuberculosis.</p><p><strong>Methods: </strong>This case-control study included 80 individuals with TB and 172 participants as controls. During January-October 2021, information was collected from thirty-six health centers in Ahvaz, southwest Iran. Five different machine learning approaches were used to identify factors associated with TB, including BMI, sex, age , marital status, education, employment status, size of the family, monthly income, cigarette smoking, hookah smoking, history of chronic illness, history of imprisonment, history of hospital admission, first-class family, second-class family, third-class family, friend, co-worker, neighbor, market, store, hospital, health center, workplace, restaurant, park, mosque, Basij base, Hairdressers and school. The data was analyzed using the statistical programming R software version 4.1.1.</p><p><strong>Results: </strong>According to the calculated evaluation criteria, the accuracy level of 5 SVM, RF, LSSVM, KNN, and NB models is 0.99, 0.72, 0.97,0.99, and 0.95, respectively, and except for RF, the other models had the highest accuracy. 
Among the 39 investigated variables, 16 factors including First-class family (20.83%), friend (17.01%), health center (41.67%), hospital (24.74%), store (18.49%), market (14.32%), workplace (9.46%), history of hospital admission (51.82%), BMI (43.75%), sex (40.36%), age (22.83%), educational status (60.59%), employment status (43.58%), monthly income (63.80%), addiction (44.10%), history of imprisonment (38.19%) were of the highest importance on tuberculosis.</p><p><strong>Conclusion: </strong>The obtained results demonstrated that machine-learning techniques are effective in identifying economic, social, and environmental factors associated with tuberculosis. Identifying these different factors plays a significant role in preventing and performing appropriate and timely interventions to control this disease.</p>","PeriodicalId":18361,"journal":{"name":"Medical Journal of the Islamic Republic of Iran","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10907055/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical Journal of the Islamic Republic of Iran","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.47176/mjiri.38.5","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"Medicine","Score":null,"Total":0}
Abstract
Background: Tuberculosis (TB) is a major public health problem. Despite effective preventive and therapeutic programs, efforts to reduce and control tuberculosis have not achieved the expected success, partly because of delays in definitive diagnosis. A diagnostic aid system for tuberculosis screening could therefore support earlier diagnosis. This study aimed to use machine learning techniques to identify economic, social, and environmental factors associated with tuberculosis.
Methods: This case-control study included 80 individuals with TB and 172 controls. Data were collected from January to October 2021 at 36 health centers in Ahvaz, southwest Iran. Five machine learning approaches were used to identify factors associated with TB. The candidate variables included BMI, sex, age, marital status, education, employment status, family size, monthly income, cigarette smoking, hookah smoking, history of chronic illness, history of imprisonment, history of hospital admission, first-class family, second-class family, third-class family, friend, co-worker, neighbor, market, store, hospital, health center, workplace, restaurant, park, mosque, Basij base, hairdresser, and school. The data were analyzed in R version 4.1.1.
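As context for any accuracy figure computed on this sample, the class balance implied by the design (80 cases, 172 controls) fixes the accuracy of a trivial classifier that always predicts "control". The Python sketch below computes that baseline. It is illustrative only: the function name `majority_baseline` is hypothetical, the study itself was analyzed in R, and the abstract does not say whether accuracy was evaluated on data with these same class proportions.

```python
def majority_baseline(n_cases: int, n_controls: int) -> float:
    """Accuracy obtained by always predicting the larger class."""
    total = n_cases + n_controls
    return max(n_cases, n_controls) / total


# Counts taken from the Methods paragraph.
N_CASES = 80
N_CONTROLS = 172

baseline = majority_baseline(N_CASES, N_CONTROLS)
print(f"Total participants: {N_CASES + N_CONTROLS}")
print(f"Majority-class baseline accuracy: {baseline:.3f}")
```

Assuming the evaluation data share the sample's class proportions, a model would need to exceed this baseline to show that it uses the predictors at all.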
Results: The accuracies of the five models, support vector machine (SVM), random forest (RF), least-squares SVM (LSSVM), k-nearest neighbors (KNN), and naive Bayes (NB), were 0.99, 0.72, 0.97, 0.99, and 0.95, respectively; all models except RF achieved an accuracy of at least 0.95. Among the 39 investigated variables, 16 factors were of the highest importance for tuberculosis: first-class family (20.83%), friend (17.01%), health center (41.67%), hospital (24.74%), store (18.49%), market (14.32%), workplace (9.46%), history of hospital admission (51.82%), BMI (43.75%), sex (40.36%), age (22.83%), educational status (60.59%), employment status (43.58%), monthly income (63.80%), addiction (44.10%), and history of imprisonment (38.19%).
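The 16 factors are listed in the Results without an apparent order, so a ranking makes the largest reported importances easier to see. The Python sketch below only sorts the percentages as printed in the abstract. It does not reproduce the study's analysis, and the abstract does not state how the importance values were computed or which model produced them.

```python
# Importance percentages copied from the Results paragraph.
importance = {
    "first-class family": 20.83,
    "friend": 17.01,
    "health center": 41.67,
    "hospital": 24.74,
    "store": 18.49,
    "market": 14.32,
    "workplace": 9.46,
    "history of hospital admission": 51.82,
    "BMI": 43.75,
    "sex": 40.36,
    "age": 22.83,
    "educational status": 60.59,
    "employment status": 43.58,
    "monthly income": 63.80,
    "addiction": 44.10,
    "history of imprisonment": 38.19,
}

# Sort factors from highest to lowest reported importance.
ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

for rank, (factor, pct) in enumerate(ranked, start=1):
    print(f"{rank:2d}. {factor:<30s} {pct:5.2f}%")
```

Sorted this way, the socioeconomic variables (monthly income, educational status) lead the list, followed by history of hospital admission.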
Conclusion: The results indicate that machine learning techniques are effective in identifying economic, social, and environmental factors associated with tuberculosis. Identifying these factors can play a significant role in prevention and in delivering appropriate and timely interventions to control the disease.