Determinants in Predicting Life Expectancy Using Machine Learning
B. Kouame Amos, I. V. Smirnov
International Journal of Advanced Engineering Research and Science. Published 2023-01-10.
DOI: 10.23947/2687-1653-2022-22-4-373-383 (https://doi.org/10.23947/2687-1653-2022-22-4-373-383)
Introduction. Life expectancy is, by definition, the average number of years a person can expect to live from birth. It is therefore among the best indicators for assessing human health, and also a comprehensive index for assessing the level of economic development, education, and health systems. Most existing studies offer qualitative analyses of only one or a few factors. Quantitative analyses of multiple factors are lacking, so the predominant factor influencing life expectancy cannot be identified with precision. Given the variety of conditions and complications observed in society today, several factors must be considered together to predict life expectancy, and various machine learning models have been developed for this purpose. The aim of this article is to identify the factors that determine life expectancy.

Materials and Methods. The Pearson correlation coefficient was used to assess correlations between indicators, and multiple linear regression, Ridge regression, and Lasso regression were used to measure the impact of each indicator on life expectancy. Models were selected using the Akaike information criterion (AIC), the coefficient of determination (R²), and the mean square error.

Results. Based on these criteria, multiple linear regression was selected for developing the life expectancy prediction model, as it obtained the smallest AIC (6109.07), with an adjusted R² of 85% and a root mean square error (RMSE) of 3.85.

Conclusion and Discussion. The variables that best explain life expectancy are adult mortality, infant mortality, percentage expenditure, measles, under-five mortality, polio, total expenditure, diphtheria, HIV/AIDS, GDP, thinness at ages 1-19 years, income composition of resources, and schooling. The results of this analysis can be used by the World Health Organization and the health sector to improve population health.
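
The correlation measure and the selection criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The data are invented, the function names `pearson` and `selection_criteria` are hypothetical, and the AIC is written in its common Gaussian least-squares form, n·ln(RSS/n) + 2k, which the abstract does not confirm as the variant used.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def selection_criteria(y_true, y_pred, n_predictors):
    """Return AIC, adjusted R^2 and RMSE for one fitted regression model."""
    n = len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    y_bar = mean(y_true)
    tss = sum((t - y_bar) ** 2 for t in y_true)  # total sum of squares
    k = n_predictors + 1  # slopes plus intercept
    r2 = 1 - rss / tss
    return {
        # Assumed Gaussian form with additive constants dropped.
        "aic": n * math.log(rss / n) + 2 * k,
        "adj_r2": 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1),
        "rmse": math.sqrt(rss / n),
    }


# Toy data, invented for illustration: years of schooling vs. life expectancy.
schooling = [8.0, 10.0, 12.0, 14.0, 16.0]
life_exp = [60.0, 66.0, 70.0, 75.0, 79.0]

# Hypothetical predictions from two candidate models, each with one predictor.
candidates = {
    "model_a": ([60.5, 65.5, 70.5, 74.5, 79.0], 1),
    "model_b": ([62.0, 66.0, 70.0, 74.0, 78.0], 1),
}

results = {
    name: selection_criteria(life_exp, preds, p)
    for name, (preds, p) in candidates.items()
}

# Lower AIC is better, so pick the candidate that minimizes it.
best = min(results, key=lambda name: results[name]["aic"])

print("Pearson r:", round(pearson(schooling, life_exp), 4))
for name, crit in results.items():
    print(name, {key: round(value, 4) for key, value in crit.items()})
print("Selected by AIC:", best)
```

In practice the predictions would come from fitted linear, Ridge, and Lasso models rather than hand-written lists. The sketch only shows how the three criteria rank candidates once predictions are available.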