K Y Guo, Y X Zhu, Y X Zhang, C Yang, H Zhao, Y L Jin
Original title: 钢铁工人高血压风险预测模型研究
Journal: 中华劳动卫生职业病杂志 (Chinese Journal of Industrial Hygiene and Occupational Diseases), vol. 43, no. 8, pp. 573-579. Published 2025-08-20. Publication type: Journal Article.
DOI: 10.3760/cma.j.cn121094-20240517-00223 (https://doi.org/10.3760/cma.j.cn121094-20240517-00223)
[Study on risk prediction model of hypertension in steel workers].
Objective: To identify risk factors for hypertension among steelworkers and to establish an effective, easily implemented hypertension prediction model.

Methods: In September 2023, 2214 steelworkers were selected as study subjects. Basic demographic, lifestyle, and occupational exposure data were collected, along with physical measurements such as height, weight, and blood pressure. Drawing on the relevant literature, multivariate unconditional logistic regression was used to determine the factors influencing hypertension among steelworkers. Logistic regression, support vector machine (SVM), random forest, extreme gradient boosting (XGBoost), and LGBM models were built and compared in Python 3.9. Model performance was evaluated using receiver operating characteristic (ROC) curves, accuracy, calibration curves, and F1 scores. Shapley Additive Explanations (SHAP) were used for feature importance analysis to make the prediction model more interpretable.

Results: Hypertension was detected in 432 of the 2214 study subjects, a detection rate of 19.51%. Age, smoking status, salt intake, use of cooling equipment, carbon monoxide exposure, family history of hypertension, fasting blood glucose, triglycerides, and hemoglobin were identified as independent risk factors for hypertension (P<0.05). The five models performed as follows, where AUC is the area under the ROC curve: logistic regression, accuracy 0.853, F1 score 0.680, Brier score 0.108, AUC 0.907; SVM, accuracy 0.863, F1 score 0.687, Brier score 0.081, AUC 0.910; random forest, accuracy 0.857, F1 score 0.603, Brier score 0.105, AUC 0.861; XGBoost, accuracy 0.850, F1 score 0.684, Brier score 0.117, AUC 0.899; LGBM, accuracy 0.838, F1 score 0.625, Brier score 0.112, AUC 0.870.

Conclusion: The SVM model performed best on all four reported metrics (highest accuracy, F1 score, and AUC; lowest Brier score). It can be used to assess hypertension risk among steelworkers and to support targeted health management interventions.
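
The abstract reports accuracy, F1 score, Brier score, and AUC but does not say how they were computed. As an illustration only, the standard definitions of these four metrics can be sketched in plain Python. The labels and predicted probabilities below are invented toy values, not study data, and the 0.5 decision threshold is an assumption; the function names are hypothetical and do not come from the paper.

```python
"""Standard definitions of the four metrics named in the abstract.

Toy data only: y_true and y_prob are invented for illustration and are
not taken from the study. The 0.5 threshold is an assumption.
"""


def classify(y_prob, threshold=0.5):
    """Turn predicted probabilities into 0/1 labels."""
    return [1 if p >= threshold else 0 for p in y_prob]


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of subjects whose predicted label matches the true label."""
    preds = classify(y_prob, threshold)
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def f1_score(y_true, y_prob, threshold=0.5):
    """Harmonic mean of precision and recall for the positive class."""
    preds = classify(y_prob, threshold)
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, y_true))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, y_true))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, y_true))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(y_prob, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy example: 1 = hypertension, 0 = no hypertension.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.2, 0.8, 0.9]

# Detection rate from the counts given in the abstract (432 of 2214).
detection_rate_pct = round(100 * 432 / 2214, 2)

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_prob))
    print("F1 score:", f1_score(y_true, y_prob))
    print("Brier score:", brier_score(y_true, y_prob))
    print("AUC:", roc_auc(y_true, y_prob))
    print("detection rate (%):", detection_rate_pct)
```

Accuracy and F1 depend on the chosen threshold, whereas the Brier score and AUC are computed from the probabilities directly, which is one reason the four metrics can rank models differently.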