Determinants in Predicting Life Expectancy Using Machine Learning

B. Kouame Amos, I. V. Smirnov
DOI: https://doi.org/10.23947/2687-1653-2022-22-4-373-383
Journal: International Journal of Advanced Engineering Research and Science
Publication date: 2023-01-10

Abstract

Introduction. Life expectancy is, by definition, the average number of years a person can expect to live from birth. It is therefore the best indicator for assessing human health, and also a comprehensive index of the level of economic development, education, and health systems. Our review of the literature found that most existing studies offer qualitative analyses of one or a few factors. Quantitative analyses of multiple factors are lacking, so the predominant factor influencing life expectancy cannot be identified with precision. Given the range of conditions and complications seen in society today, several factors must be considered together to predict life expectancy, and various machine learning models have been developed for this purpose. The aim of this article is to identify the factors that determine life expectancy.

Materials and Methods. Our research uses the Pearson correlation coefficient to assess correlations between indicators, and multiple linear regression, Ridge regression, and Lasso regression to measure the impact of each indicator on life expectancy. For model selection, the Akaike information criterion (AIC), the coefficient of determination (R²), and the mean square error were used.

Results. Based on these criteria, multiple linear regression was selected to build the life expectancy prediction model, as it obtained the smallest Akaike information criterion (6109.07), an adjusted R² of 85%, and an RMSE of 3.85.

Conclusion and Discussion. We concluded that the variables that best explain life expectancy are adult mortality, infant mortality, percentage of expenditure, measles, under-five mortality, polio, total expenditure, diphtheria, HIV/AIDS, GDP, longevity of 1.19 years, resource composition, and schooling. The results of this analysis can be used by the World Health Organization and the health sectors to improve society.
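
The workflow described under Materials and Methods can be illustrated with a minimal sketch. It is not the authors' code. The data are synthetic, and the variable names, penalty strengths (`alpha`), and the hand-written Ridge and Lasso solvers are assumptions made only so that the example runs with NumPy alone. It computes Pearson correlations, fits the three regressions, and scores each with AIC, adjusted R², and RMSE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's dataset): 200 rows, 4 indicators.
n, p = 200, 4
X = rng.normal(size=(n, p))
true_beta = np.array([-3.0, 0.0, 1.5, 2.0])  # hypothetical effects
y = 70.0 + X @ true_beta + rng.normal(scale=2.0, size=n)

# Pearson correlation between each indicator and the target.
pearson = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])

# Standardize predictors and center the target so the penalties treat
# every column alike and no explicit intercept column is needed.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
yc = y - y.mean()

def fit_ols(X, y):
    # Ordinary least squares via a numerically stable solver.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fit_ridge(X, y, alpha):
    # Closed form: (X'X + alpha * I)^-1 X'y
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def fit_lasso(X, y, alpha, n_iter=500):
    # Coordinate descent for (1 / 2n) * ||y - Xb||^2 + alpha * ||b||_1
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]  # residual with feature j removed
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return b

def scores(X, y, b):
    n, k = X.shape
    resid = y - X @ b
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - rss / tss
    return {
        # Gaussian AIC up to an additive constant; k + 1 counts the intercept.
        # Using the raw k for Ridge and Lasso is a simplification.
        "aic": n * np.log(rss / n) + 2 * (k + 1),
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
        "rmse": float(np.sqrt(rss / n)),
    }

coefs = {
    "ols": fit_ols(Xs, yc),
    "ridge": fit_ridge(Xs, yc, alpha=1.0),
    "lasso": fit_lasso(Xs, yc, alpha=0.1),
}
results = {name: scores(Xs, yc, b) for name, b in coefs.items()}
best = min(results, key=lambda name: results[name]["aic"])

for name, s in results.items():
    print(name, {key: round(val, 3) for key, val in s.items()})
print("lowest AIC:", best)
```

Because all three models are scored on the data they were fitted to and with the same parameter count, ordinary least squares necessarily attains the lowest residual sum of squares and hence the lowest AIC in this sketch. A comparison on held-out data, or an AIC based on effective degrees of freedom for the penalized models, would be a fairer test.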