Prediction of Childhood Diarrhea in Bangladesh using Machine Learning Approach

Md. Maniruzzaman, Md. Shaykhul Islam, M. Abedin, M. Amanullah, Sadiq Hussain
Journal: Insights of biomedical research
Publication date: 2020-12-31
Publication type: Journal Article
DOI: 10.36959/584/456 (https://doi.org/10.36959/584/456)
Citations: 2

Abstract

Diarrhea remains a major health problem among under-five (U5) children, leading to high levels of morbidity and mortality. This study aims to determine the socio-demographic risk factors for diarrhea and to predict diarrhea status among U5 children in Bangladesh using a machine learning (ML) based approach. The study uses the Bangladesh Demographic and Health Survey 2014 dataset, which comprises 7,538 respondents, of whom 371 (4.9%) reported that their child had experienced an episode of diarrhea in the two weeks before the survey. Logistic regression (LR) was used to identify the high-risk factors for diarrhea. Four ML-based classifiers, namely naïve Bayes (NB), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM), were then applied to predict the child's diarrhea status, and accuracy, sensitivity, and specificity were used to evaluate their performance. The LR model showed that the child's age, region (Khulna and Rangpur), the mother having completed secondary education, and a rich wealth index were significantly associated with diarrhea. SVM with a radial basis kernel yielded 65.61% accuracy, 66.27% sensitivity, and 52.28% specificity, which is comparatively better than the other classifiers. Diarrhea remains common among Bangladeshi children. The study shows that SVM can predict child diarrhea status even though such data are generally highly imbalanced. These findings may help policy makers make appropriate decisions to reduce childhood diarrhea in Bangladesh.
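The abstract evaluates classifiers by accuracy, sensitivity, and specificity on data where only 371 of 7,538 respondents are cases. The sketch below is not the authors' code. It only illustrates, under stated assumptions, how the three metrics are computed from confusion-matrix counts and why accuracy alone can mislead on such imbalanced data. The `ConfusionCounts` class, the helper functions, and the "always predict no diarrhea" baseline are hypothetical and introduced purely for illustration, as is the choice of "child had diarrhea" as the positive class; only the totals 7,538 and 371 come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts.

    Assumption: the positive class is "child had diarrhea".
    """
    tp: int  # cases predicted as cases
    fp: int  # non-cases predicted as cases
    tn: int  # non-cases predicted as non-cases
    fn: int  # cases predicted as non-cases


def accuracy(c: ConfusionCounts) -> float:
    """Share of all respondents classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true cases that were detected (true positive rate)."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


def specificity(c: ConfusionCounts) -> float:
    """Share of true non-cases that were detected (true negative rate)."""
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else 0.0


# Totals taken from the abstract.
TOTAL_RESPONDENTS = 7538
DIARRHEA_CASES = 371

# Hypothetical baseline (not from the paper): a classifier that labels
# every child as "no diarrhea", so it never produces a positive prediction.
baseline = ConfusionCounts(
    tp=0,
    fp=0,
    tn=TOTAL_RESPONDENTS - DIARRHEA_CASES,
    fn=DIARRHEA_CASES,
)

if __name__ == "__main__":
    print(f"accuracy:    {accuracy(baseline):.3f}")
    print(f"sensitivity: {sensitivity(baseline):.3f}")
    print(f"specificity: {specificity(baseline):.3f}")
```

Under these assumptions the trivial baseline reaches about 95.1% accuracy while detecting none of the cases (sensitivity 0, specificity 1), which is one reason sensitivity and specificity are worth reporting next to accuracy for a dataset like this one.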