{"title":"精确贝叶斯分类器:结果总结","authors":"Amin Vahedian, Xun Zhou","doi":"10.1109/ICDM51629.2021.00076","DOIUrl":null,"url":null,"abstract":"The Bayes Classifier is shown to have the minimal classification error, in addition to interpretable predictions. However, it requires the knowledge of underlying distributions of the predictors to be usable. This requirement is almost never satisfied. Naive Bayes classifiers and variants estimate this classifier by assuming the independence among predictors. This restrictive assumption hinders both the accuracy of these classifiers and their interpretability, as the calculated probabilities become less reliable. Moreover, it is argued in the literature that interpretability comes at the expense of accuracy and vice versa. In this paper, we are motivated by the accurate and interpretable nature of the Bayes Classifier. We propose Precise Bayes, which is a computationally efficient estimation of the Bayes Classifier based on a new formulation. Our method makes no assumptions, neither on independence nor on underlying distributions. We devise a new theoretical minimal error rate for our formulation and show that the error rate of Precise Bayes approaches this limit with increasing number of samples learned. Moreover, the calculated posterior probabilities, are actual empirical probabilities calculated by counting the observations and outcomes. This makes the predictions made by Precise Bayes fully explainable. Our evaluations on generated datasets and real datasets validate our theoretical claims on prediction error rate and computational efficiency.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Precise Bayes Classifier: Summary of Results\",\"authors\":\"Amin Vahedian, Xun Zhou\",\"doi\":\"10.1109/ICDM51629.2021.00076\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Bayes Classifier is shown to have the minimal classification error, in addition to interpretable predictions. However, it requires the knowledge of underlying distributions of the predictors to be usable. This requirement is almost never satisfied. Naive Bayes classifiers and variants estimate this classifier by assuming the independence among predictors. This restrictive assumption hinders both the accuracy of these classifiers and their interpretability, as the calculated probabilities become less reliable. Moreover, it is argued in the literature that interpretability comes at the expense of accuracy and vice versa. In this paper, we are motivated by the accurate and interpretable nature of the Bayes Classifier. We propose Precise Bayes, which is a computationally efficient estimation of the Bayes Classifier based on a new formulation. Our method makes no assumptions, neither on independence nor on underlying distributions. We devise a new theoretical minimal error rate for our formulation and show that the error rate of Precise Bayes approaches this limit with increasing number of samples learned. Moreover, the calculated posterior probabilities, are actual empirical probabilities calculated by counting the observations and outcomes. This makes the predictions made by Precise Bayes fully explainable. 
Our evaluations on generated datasets and real datasets validate our theoretical claims on prediction error rate and computational efficiency.\",\"PeriodicalId\":320970,\"journal\":{\"name\":\"2021 IEEE International Conference on Data Mining (ICDM)\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Data Mining (ICDM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM51629.2021.00076\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM51629.2021.00076","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: The Bayes Classifier is known to attain the minimal classification error while also producing interpretable predictions. However, it requires knowledge of the underlying distributions of the predictors, a requirement that is almost never satisfied in practice. Naive Bayes classifiers and their variants approximate the Bayes Classifier by assuming independence among the predictors. This restrictive assumption hurts both the accuracy of these classifiers and their interpretability, as the calculated probabilities become less reliable. Moreover, it is argued in the literature that interpretability comes at the expense of accuracy, and vice versa. In this paper, we are motivated by the accurate and interpretable nature of the Bayes Classifier. We propose Precise Bayes, a computationally efficient estimation of the Bayes Classifier based on a new formulation. Our method makes no assumptions, either of independence among predictors or about their underlying distributions. We derive a new theoretical minimal error rate for our formulation and show that the error rate of Precise Bayes approaches this limit as the number of learned samples grows. Moreover, the calculated posterior probabilities are true empirical probabilities, obtained by counting observations and outcomes, which makes the predictions of Precise Bayes fully explainable. Our evaluations on synthetic and real datasets validate our theoretical claims on prediction error rate and computational efficiency.
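For context, the contrast the abstract draws can be stated formally; this is standard textbook notation, not a formulation taken from the paper itself. The Bayes Classifier assigns each input the class with the largest posterior probability, while naive Bayes makes the class-conditional likelihood tractable by factorizing it per feature, which is exactly the independence assumption criticized above:

$$h^*(x) = \arg\max_{y} P(Y = y \mid X = x) = \arg\max_{y} P(Y = y)\, P(X = x \mid Y = y)$$

$$\text{Naive Bayes: } P(X = x \mid Y = y) \approx \prod_{j=1}^{d} P(X_j = x_j \mid Y = y)$$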
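The abstract states that the posteriors of Precise Bayes are empirical probabilities obtained "by counting the observations and outcomes." As a minimal sketch of what counting-based posterior estimation looks like for discrete predictors, consider the toy classifier below; the class name `CountingPosteriorClassifier` and all implementation details are hypothetical illustrations, since the paper's actual formulation is not reproduced here.

```python
from collections import Counter

class CountingPosteriorClassifier:
    """Toy counting-based estimate of the posterior P(y | x) for
    discrete predictors: P(y | x) ~= count(x, y) / count(x).
    Hypothetical sketch only; not the paper's actual formulation."""

    def __init__(self):
        self.joint = Counter()        # counts of (x, y) pairs
        self.marginal = Counter()     # counts of x alone
        self.class_counts = Counter() # counts of y alone (class priors)

    def fit(self, X, y):
        for xi, yi in zip(X, y):
            key = tuple(xi)
            self.joint[(key, yi)] += 1
            self.marginal[key] += 1
            self.class_counts[yi] += 1
        return self

    def posterior(self, x):
        key = tuple(x)
        n = self.marginal[key]
        if n == 0:
            # Unseen predictor combination: the empirical posterior is
            # undefined, so fall back to the empirical class priors.
            total = sum(self.class_counts.values())
            return {c: k / total for c, k in self.class_counts.items()}
        return {c: self.joint[(key, c)] / n for c in self.class_counts}

    def predict(self, x):
        post = self.posterior(x)
        return max(post, key=post.get)


# Usage: every probability is a ratio of observed counts, so each
# prediction can be explained by pointing at the training data.
X = [(0, 1), (0, 1), (0, 1), (1, 0), (1, 0)]
y = ["a", "a", "b", "b", "b"]
clf = CountingPosteriorClassifier().fit(X, y)
print(clf.posterior((0, 1)))  # {'a': 0.666..., 'b': 0.333...}
print(clf.predict((0, 1)))    # 'a'
```

Note that a naive implementation of this idea degrades as the predictor space grows, since most combinations are never observed; the abstract's claims of computational efficiency and a provable error limit suggest the paper's formulation addresses this, but those details are beyond what the abstract provides.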