On a capacity control using Boolean kernels for the learning of Boolean functions
Ken Sadohara
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Published 2002-12-09.
DOI: 10.1109/ICDM.2002.1183934 (https://doi.org/10.1109/ICDM.2002.1183934)
Citations: 11
Abstract
This paper concerns the classification task in discrete attribute spaces, but considers the task in a more fundamental framework: the learning of Boolean functions. It presents a new learning algorithm for Boolean functions, the Boolean kernel classifier (BKC), which employs capacity control using Boolean kernels. BKC uses support vector machines (SVMs) as learning engines, and Boolean kernels are primarily used to run SVMs in feature spaces spanned by conjunctions of Boolean literals. However, another important role of Boolean kernels is to control the size of the hypothesis space appropriately, to avoid overfitting. After applying an SVM to learn a classifier $f$ in a feature space $H$ induced by a Boolean kernel, BKC uses another Boolean kernel to compute the projection $f^k$ of $f$ onto the subspace $H_k$ of $H$ spanned by conjunctions of length at most $k$. By evaluating the accuracy of $f^k$ on the training data for each $k$, BKC can determine the smallest $k$ such that $f^k$ is as accurate as $f$, and then learn another classifier $f'$ in $H_k$ that is expected to have lower error on unseen data. An empirical study on learning randomly generated Boolean functions shows that the capacity control is effective and that BKC outperforms C4.5 and naive Bayes classifiers.
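The projection step can be made concrete with a small sketch. The abstract does not give the kernel's formula, so the code below assumes a simple counting kernel: the number of conjunctions of at most k literals that are true on both inputs. All function names, the toy data, and the dual coefficients are hypothetical; in particular the coefficients are set by hand rather than obtained from an SVM, and the final retraining of f' in H_k is not shown.

```python
from itertools import combinations, product
from math import comb


def agree(u, v):
    """Number of attributes on which two Boolean vectors take the same value."""
    return sum(1 for a, b in zip(u, v) if a == b)


def boolean_kernel(u, v, k=None):
    """Count the conjunctions of 1..k literals that are true on both u and v.

    Assumed form, not taken from the paper: a conjunction holds on both
    inputs only if every literal tests an attribute on which they agree,
    so the count is sum_{i=1..k} C(s, i) with s = agree(u, v).
    With k=None, every length up to len(u) is counted.
    """
    s = agree(u, v)
    k = len(u) if k is None else k
    return sum(comb(s, i) for i in range(1, k + 1))


def brute_force_kernel(u, v, k):
    """Same count by explicit enumeration (only to check the closed form)."""
    total = 0
    for length in range(1, k + 1):
        for idx in combinations(range(len(u)), length):
            # Each sign pattern fixes which value every chosen attribute must take.
            for signs in product((0, 1), repeat=length):
                if all(u[i] == s and v[i] == s for i, s in zip(idx, signs)):
                    total += 1
    return total


def projected_decision(x, support, coef, bias, k):
    """Evaluate f^k(x) = sum_i coef_i * K_k(x_i, x) + bias.

    coef_i stands for alpha_i * y_i of a trained kernel machine; replacing
    the full kernel by K_k drops every feature longer than k literals.
    """
    return sum(c * boolean_kernel(sv, x, k) for sv, c in zip(support, coef)) + bias


def accuracy(data, labels, support, coef, bias, k):
    """Fraction of examples whose sign under f^k matches the +1/-1 label."""
    hits = sum(
        1
        for x, y in zip(data, labels)
        if (projected_decision(x, support, coef, bias, k) >= 0) == (y > 0)
    )
    return hits / len(data)


def smallest_sufficient_k(data, labels, support, coef, bias):
    """Smallest k whose projection f^k is as accurate on the data as f itself."""
    n = len(data[0])
    full = accuracy(data, labels, support, coef, bias, n)
    for k in range(1, n + 1):
        if accuracy(data, labels, support, coef, bias, k) >= full:
            return k
    return n


# Toy data: all 2-bit inputs, with hand-set coefficients coef_i = y_i.
DATA = [(0, 0), (0, 1), (1, 0), (1, 1)]
LITERAL = [-1, -1, 1, 1]   # target is the single literal x0
XOR = [-1, 1, 1, -1]       # target is x0 XOR x1

if __name__ == "__main__":
    print(smallest_sufficient_k(DATA, LITERAL, DATA, LITERAL, 0.0))
    print(smallest_sufficient_k(DATA, XOR, DATA, XOR, 0.0))
```

On this toy data the single-literal target is already matched at k = 1, whereas the XOR target needs k = 2, since no weighted sum of single literals separates it. In a fuller pipeline the hand-set coefficients would presumably come from an SVM trained on the full Boolean kernel, for example through a precomputed Gram matrix.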