Why the Naive Bayes approximation is not as Naive as it appears

C. Stephens, Hugo Flores, Ana Ruiz Linares
DOI: 10.1109/IISA.2015.7388083
Published in: 2015 6th International Conference on Information, Intelligence, Systems and Applications (IISA)
Publication date: 2015-07-06
Citations: 1

Abstract

The Naive Bayes approximation and its associated classifier are widely used in machine learning and data mining, offering robust performance across a broad spectrum of problem domains. This has been somewhat puzzling, since the method depends on a very strong assumption: independence among the features. Various hypotheses have been put forward to explain its success, and many generalizations have been proposed. In this paper we propose a set of "local" error measures, associated with the likelihood functions for particular subsets of attributes and for each class, and show explicitly how these local errors combine to give a "global" error associated with the full attribute set. In so doing we formulate a framework within which the phenomenon of error cancellation, or augmentation, can be quantitatively evaluated and its impact on classifier performance estimated and predicted a priori. These diagnostics also allow us to develop a deeper and more quantitative understanding of why the Naive Bayes approximation is so robust, and of the circumstances under which one expects it to break down.
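The decomposition the abstract describes can be illustrated on a minimal example. The following sketch is not the paper's formalism; it assumes a simple chain-rule definition of the "local" error (the per-class gap between a conditional likelihood and the marginal that Naive Bayes substitutes for it), and shows how those per-class local errors combine into the global error in the log-odds, where they can cancel without flipping the classification decision:

```python
import numpy as np

# Two binary attributes, two classes, with within-class correlation,
# so the Naive Bayes independence assumption is violated.
# Joint distributions P(x1, x2 | c); rows index x1, columns index x2.
# (Hypothetical numbers chosen for illustration only.)
P = {
    0: np.array([[0.5, 0.2],
                 [0.1, 0.2]]),
    1: np.array([[0.2, 0.1],
                 [0.2, 0.5]]),
}

def true_log_odds(x1, x2):
    """log P(x1,x2|0) - log P(x1,x2|1): the exact evidence."""
    return np.log(P[0][x1, x2]) - np.log(P[1][x1, x2])

def nb_log_odds(x1, x2):
    """Naive Bayes replaces each joint with the product of its marginals."""
    m = {c: (P[c].sum(axis=1), P[c].sum(axis=0)) for c in (0, 1)}
    return (np.log(m[0][0][x1]) - np.log(m[1][0][x1])
            + np.log(m[0][1][x2]) - np.log(m[1][1][x2]))

def local_error(c, x1, x2):
    """Per-class 'local' error for attribute x2 given x1:
    log P(x2|x1,c) - log P(x2|c), i.e. exactly what NB discards."""
    cond = P[c][x1, x2] / P[c][x1].sum()
    marg = P[c].sum(axis=0)[x2]
    return np.log(cond) - np.log(marg)

for x1 in (0, 1):
    for x2 in (0, 1):
        g = true_log_odds(x1, x2) - nb_log_odds(x1, x2)      # global error
        l = local_error(0, x1, x2) - local_error(1, x1, x2)  # combined local errors
        # The per-class local errors combine (here: subtract) to give the
        # global error in the log-odds, so they can cancel across classes...
        assert np.isclose(g, l)
        # ...and even a nonzero global error can leave the decision intact:
        assert np.sign(true_log_odds(x1, x2)) == np.sign(nb_log_odds(x1, x2))

print("local errors combine to the global error; all decisions preserved")
```

In this toy example every instance has a nonzero global error, yet the sign of the log-odds, and hence the classification, is unchanged for all four instances: the sense in which the approximation is "not as naive as it appears".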