Interpretable machine learning analysis to identify risk factors for diabetes using the anonymous living census data of Japan
Authors: Pei Jiang, Hiroyuki Suzuki, Takashi Obi
Journal: Health and Technology, vol. 13, no. 1, pp. 119-131 (2023); published online 2023-01-26
DOI: https://doi.org/10.1007/s12553-023-00730-w
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9876749/pdf/
Purpose: Diabetes mellitus causes a wide range of problems in daily life. Despite the growth of big data in society, some risk factors for diabetes likely remain unidentified. To identify new risk factors for diabetes and to explore more efficient uses of big data, the non-objective-oriented census data from the Japanese Citizen's Survey of Living Conditions were analyzed using interpretable machine learning methods.
Methods: Seven interpretable machine learning methods were used to analyze census data on Japanese citizens. First, logistic analysis was used to analyze the risk factors for diabetes among 19 initially selected elements. Then, linear analysis, linear discriminant analysis, Hayashi's quantification method II, random forest, XGBoost, and SHAP were used to re-check the results and to compare how much each factor contributed under each method. Finally, the relationships among the factors were analyzed.
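The first step of the Methods can be illustrated with a minimal sketch, assuming that "logistic analysis" means logistic regression. The data below are synthetic, and the feature names, effect sizes, and training settings are hypothetical choices made for illustration; none of them come from the survey or from the authors' code. The sketch fits the model by plain gradient descent and reports each factor's odds ratio, which is the usual way a logistic coefficient is read as a risk contribution.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=400):
    """Fit logistic regression by batch gradient descent.

    Returns (weights, intercept). lr and epochs are illustrative defaults.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            pred = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = pred - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


# Synthetic data (hypothetical): one true binary risk factor and one
# unrelated binary flag that has no effect on the outcome.
rng = random.Random(0)
names = ["hypertension", "unrelated_flag"]
X, y = [], []
for _ in range(2000):
    hyp = 1.0 if rng.random() < 0.3 else 0.0
    flag = 1.0 if rng.random() < 0.5 else 0.0
    p_diabetes = sigmoid(-2.0 + 1.2 * hyp)  # assumed ground truth
    X.append([hyp, flag])
    y.append(1 if rng.random() < p_diabetes else 0)

w, b = fit_logistic(X, y)

# exp(coefficient) is the odds ratio associated with each factor.
odds_ratios = {name: math.exp(wj) for name, wj in zip(names, w)}
for name, oratio in odds_ratios.items():
    print(f"{name}: odds ratio = {oratio:.2f}")
```

An odds ratio clearly above 1 marks a factor as raising the odds of the outcome, while a value near 1 suggests no association. The tree-based methods and SHAP named above would instead rank factors by importance or attribution scores, which is why the abstract compares contributions across models.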
Results: Four factors (the number of family members, insurance type, public pension type, and health awareness level) were identified as risk factors for diabetes mellitus for the first time, and another 11 risk factors were reconfirmed in this analysis. Notably, in some interpretable models, insurance type and health awareness level contributed more to diabetes than hypertension, hyperlipidemia, and stress. We also found that work years were identified as a risk factor for diabetes because of their high correlation coefficient with age, which is itself a risk factor.
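The last finding, that one factor can appear as a risk factor mainly because it tracks another, can be checked with a pairwise correlation. The following is a minimal sketch, assuming the "coefficient" in question is a correlation coefficient such as Pearson's r. The variables, value ranges, and noise levels are hypothetical and generated synthetically; they are not drawn from the census data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


rng = random.Random(1)

# Hypothetical respondents: work years roughly follow age minus an
# assumed entry age of 22, plus noise; family size is independent of age.
age = [rng.uniform(25, 75) for _ in range(1000)]
work_years = [max(0.0, a - 22 + rng.gauss(0, 4)) for a in age]
family_size = [float(rng.randint(1, 6)) for _ in age]

r_age_work = pearson(age, work_years)
r_age_family = pearson(age, family_size)

print(f"age vs work years:  r = {r_age_work:.2f}")
print(f"age vs family size: r = {r_age_family:.2f}")
```

When two factors are this strongly correlated, a model cannot easily separate their effects, so a contribution assigned to one of them may largely reflect the other.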
Conclusions: New risk factors for diabetes mellitus were identified from Japan's non-objective-oriented anonymous census data using interpretable machine learning models. The newly identified risk factors suggest possible new policies for preventing diabetes. Moreover, our analysis confirms that big data can yield useful knowledge in today's prosperous society. Our study also paves the way for identifying further risk factors and for using big data more efficiently.
About the journal:
Health and Technology is the first truly cross-disciplinary journal on issues related to health technologies, addressing all professions relating to health, care and health technology.

The journal constitutes an information platform connecting medical technology and informatics with the needs of care, health care professionals and patients. Thus, medical physicists and biomedical/clinical engineers are encouraged to write articles not only for their colleagues, but directed to all other groups of readers as well, and vice versa.

By its nature, the journal presents and discusses hot subjects including but not limited to patient safety, patient empowerment, disease surveillance and management, e-health and issues concerning data security, privacy, reliability and management, data mining and knowledge exchange as well as health prevention. The journal also addresses the medical, financial, social, educational and safety aspects of health technologies as well as health technology assessment and management, including issues such as security, efficacy, and cost in comparison to the benefit, as well as social, legal and ethical implications.

This journal is a communicative source for the health workforce (physicians, nurses, medical physicists, clinical engineers, biomedical engineers, hospital engineers, etc.), the ministries of health, hospital management, self-employed doctors, health care providers and regulatory agencies, the medical technology industry, patients' associations, universities (biomedical and clinical engineering, medical physics, medical informatics, biology, medicine and public health as well as health economics programs), research institutes and professional, scientific and technical organizations.

Health and Technology is jointly published by Springer and the IUPESM (International Union for Physical and Engineering Sciences in Medicine) in cooperation with the World Health Organization.