Big Data Science in Building Medical Data Classifier Using Naïve Bayes Model

Kevin D'souza, Z. Ansari
2018 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), published 2018-11-01. DOI: 10.1109/CCEM.2018.00020 (https://doi.org/10.1109/CCEM.2018.00020)
Citations: 3

Abstract

Maintaining clinical databases has become a crucial task in the medical field. Patient data, consisting of various features and diagnostics related to disease, should be entered with the utmost care to provide quality services. Because data stored in medical databases may contain missing values and redundant entries, mining medical data becomes cumbersome. Since these defects can affect mining results, good data preparation and data reduction are essential before applying data mining algorithms. Disease prediction becomes quicker and easier when the data is precise, consistent, and free from noise. One of the key strengths of Naïve Bayes classifiers is that they are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a closed-form expression, which takes linear time, rather than by the expensive iterative approximation used for many other types of classifiers. This research uses a data science approach to support diagnosis from medical data. In this article, a Naïve Bayes classifier is used to classify medical data. The suitability and accuracy of the classifier are measured using different performance criteria. This study is useful for researchers and developers in understanding and using a classification technique in medical diagnosis.
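The closed-form, linear-time training mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy records, the feature meanings (fever, cough, fatigue), and the use of Laplace smoothing (`alpha`) are all assumptions made for illustration. Training is a single counting pass over the data, and the number of stored counts grows linearly with the number of features.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Maximum-likelihood training by counting: one pass over the data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    feature_values = defaultdict(set)    # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row, alpha=1.0):
    """Return the class with the highest log-posterior (alpha: assumed Laplace smoothing)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior
        for j, value in enumerate(row):
            count = value_counts[(j, label)][value]
            n_values = len(feature_values[j])
            score += math.log((count + alpha) / (n_c + alpha * n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(model, rows, labels, positive):
    """Accuracy, precision and recall with respect to one positive class."""
    tp = fp = fn = correct = 0
    for row, truth in zip(rows, labels):
        guess = predict(model, row)
        correct += guess == truth
        tp += guess == positive and truth == positive
        fp += guess == positive and truth != positive
        fn += guess != positive and truth == positive
    return {
        "accuracy": correct / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical toy records: (fever, cough, fatigue) -> diagnosis.
rows = [
    ("high", "yes", "yes"), ("high", "yes", "no"), ("high", "no", "yes"),
    ("normal", "yes", "no"), ("normal", "no", "no"), ("normal", "no", "yes"),
]
labels = ["flu", "flu", "flu", "healthy", "healthy", "healthy"]

model = train_naive_bayes(rows, labels)
metrics = evaluate(model, rows, labels, positive="flu")
print(predict(model, ("high", "yes", "yes")), metrics)
```

The scores here are computed on the training records only, so they say nothing about generalization; a real study would evaluate on held-out data after the data preparation steps the abstract describes.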