Learning lazy naive Bayesian classifiers for ranking
Liangxiao Jiang, Yuanyuan Guo
17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'05), November 14, 2005
DOI: 10.1109/ICTAI.2005.80
Citations: 23
Abstract
Learning lazy naive Bayesian classifiers for ranking
Naive Bayes (NB) is well known as an effective and efficient classification algorithm. However, it relies on the conditional independence assumption, which is often violated in real applications. Moreover, many real-world data mining applications require an accurate ranking of instances rather than an accurate classification. For example, a ranking of customers by the likelihood that they will buy one's products is useful in direct marketing. In this paper, we first investigate the ranking performance of some lazy learning algorithms that extend naive Bayes, measured by AUC (Hand and Till, 2001; Bradley, 1997). We observe that they cannot significantly improve naive Bayes' ranking performance. Motivated by this fact, and aiming to improve naive Bayes' ranking accuracy, we present a new lazy learning algorithm, called lazy naive Bayes (LNB), that extends naive Bayes for ranking. We experimentally tested our algorithm on all 36 UCI data sets (Blake and Merz, 2000) recommended by Weka and compared it to NB and C4.4 (Provost and Domingos, 2003), measured by AUC. The experimental results show that our algorithm significantly outperforms both NB and C4.4.
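To make the abstract's setup concrete, the sketch below shows how naive Bayes posterior probabilities can be used to rank instances, and how that ranking is scored by AUC via the Mann-Whitney statistic (the measure attributed above to Hand and Till, 2001, and Bradley, 1997). This is an illustrative baseline only, not the paper's LNB algorithm; the toy data, Laplace smoothing choice, and function names are assumptions.

```python
from collections import Counter, defaultdict

def train_nb(X, y, alpha=1.0):
    """Train a categorical naive Bayes model with Laplace smoothing alpha."""
    classes = sorted(set(y))
    class_counts = Counter(y)
    priors = {c: (class_counts[c] + alpha) / (len(y) + alpha * len(classes))
              for c in classes}
    cond = defaultdict(Counter)   # cond[(j, c)][v]: count of value v of feature j in class c
    values = defaultdict(set)     # distinct values observed per feature index
    for xi, c in zip(X, y):
        for j, v in enumerate(xi):
            cond[(j, c)][v] += 1
            values[j].add(v)
    return classes, priors, cond, values, alpha, class_counts

def posterior(model, x, c):
    """Unnormalized P(c) * prod_j P(x_j | c) under the independence assumption."""
    classes, priors, cond, values, alpha, class_counts = model
    p = priors[c]
    for j, v in enumerate(x):
        p *= (cond[(j, c)][v] + alpha) / (class_counts[c] + alpha * len(values[j]))
    return p

def score_positive(model, x):
    """Normalized posterior of the last (by sort order) class label; used as the ranking score."""
    classes = model[0]
    ps = [posterior(model, x, c) for c in classes]
    return ps[-1] / sum(ps)

def auc(scores, labels, pos):
    """AUC as the Mann-Whitney statistic: the probability that a random
    positive instance is scored above a random negative one (ties count 0.5)."""
    pos_s = [s for s, l in zip(scores, labels) if l == pos]
    neg_s = [s for s, l in zip(scores, labels) if l != pos]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_s for n in neg_s)
    return wins / (len(pos_s) * len(neg_s))

# Toy example (hypothetical data): rank instances by P(class=1 | x).
X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
y = [1, 1, 0, 0]
model = train_nb(X, y)
scores = [score_positive(model, xi) for xi in X]
print(auc(scores, y, pos=1))
```

The key point the abstract relies on is that AUC evaluates the ordering of the probability estimates, not the 0/1 decisions, so two classifiers with identical accuracy can differ substantially in ranking quality.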