Predicting the risk of becoming eligible for the disability pension: Machine learning methods applied to French health data

IF 0.3 · Zone 4 (Medicine) · Q4 · PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH
Corinne Mette, Dorian Verboux, Antoine Rachas, Gonzague Debeugny
Journal: Sante Publique. Publication date: 2024-02-23. Publication type: Journal Article. DOI: https://doi.org/10.3917/spub.236.0065

Abstract

Introduction: Receiving a disability pension has health-related (physical and psychological) and social (loss of income) consequences for the person. It also has economic consequences for society: spending has increased since 2011 (+4.9% per year on average). Investing in preventive action against the loss of the ability to work should limit these consequences, but it requires identifying the people at risk. The development of artificial intelligence opens up prospects in this regard.

Purpose of the research: To identify, using supervised machine learning methods, people with a high probability of becoming eligible for the disability pension within the year, based on their socio-demographic and medical characteristics (pathologies, sick leave, medications taken, and medical procedures).

Method: Among the beneficiaries of the French public welfare system aged 20–64 in 2017, we compared the 2014–2016 socio-demographic and medical characteristics of those who received a disability pension in 2017 and not before with those of people who received no disability pension from 2014 to 2017. How well these two groups could be separated was tested using logistic regression, decision trees, random forests, naive Bayes classifiers, and support vector machines. Model performance was compared in terms of accuracy, precision, sensitivity, specificity, and AUC (area under the curve). Finally, the predictive power of each factor was also measured by AUC.
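The five comparison criteria named in the Method can be illustrated with a minimal sketch. This is not the authors' code: the function names (`confusion_counts`, `classification_metrics`, `auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions invented for illustration. Here AUC is computed as the share of (positive, negative) pairs in which the positive case gets the higher score, which applies equally to a model's predicted probability and to a single binary factor.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = became eligible."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity, and specificity from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
    }


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration; not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]                            # 1 = pension in target year
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.2, 0.1, 0.05]           # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]              # assumed 0.5 threshold
factor = [1, 1, 0, 1, 0, 0, 0, 0]                            # hypothetical binary factor

print(classification_metrics(y_true, y_pred))
print("model AUC:", auc(y_true, scores))
print("single-factor AUC:", auc(y_true, factor))
```

For a binary factor, this AUC equals the mean of sensitivity and specificity obtained when the factor itself is used as the prediction, which is one way to read a per-factor AUC.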

Results: Boosted logistic regression performed best on three of the five criteria but had low sensitivity. Support vector machines achieved the best sensitivity, with accuracy close to that of boosted logistic regression but lower precision and specificity. Random forests offered the best discriminatory ability. The naive Bayes classifier performed worst. The factors most predictive of becoming eligible for the disability pension were having 30 or more days of sick leave in 2014, 2015, and 2016 and being aged 55 to 64.

Conclusion: Supervised learning methods appear relevant for identifying people with the highest probability of becoming eligible for the disability pension and, more broadly, for steering public and social policies.

Source journal: Sante Publique (PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH)
CiteScore: 0.40
Self-citation rate: 33.30%
Articles published: 252
Review time: >12 weeks
Journal description: The journal Santé Publique is aimed at everyone working in public health, whether decision-makers, health professionals, field workers, researchers, teachers, or trainers. It publishes research, evaluations, analyses of actions, reflections on health interventions, and opinions in the fields of public health and health services analysis, the social sciences, and social action. Santé Publique is a peer-reviewed, multidisciplinary, generalist journal that publishes on all public health topics, including: access to and use of care, social determinants and inequalities in health, prevention, health education, health promotion, organization of care, environment, training of health professionals, nutrition, health policy, professional practice, quality of care, health risk management, representations and perceived health, school health, occupational health, health systems, information systems, health surveillance, determinants of health care consumption, organization and economics of the various care-producing sectors (hospitals, medicines, etc.), medico-economic evaluation of care or prevention activities and health programs, resource planning, and regulation and financing policies.