A bottom-up oblique decision tree induction algorithm

2011 11th International Conference on Intelligent Systems Design and Applications. Pub Date: 2011-11-01. DOI: 10.1109/ISDA.2011.6121697

Rodrigo C. Barros, R. Cerri, P. Jaskowiak, A. Carvalho


Citations: 26

Abstract

Decision tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. Most oblique and univariate decision tree induction algorithms grow the tree with a top-down strategy, relying on an impurity-based measure to split nodes. In this paper, we propose BUTIA, a novel bottom-up algorithm for inducing oblique trees. It does not require an impurity measure for dividing nodes, since the data resulting from each split is known a priori. To generate the splitting hyperplanes, our algorithm uses a support vector machine solution, and a clustering algorithm generates the initial leaves. We compare BUTIA to traditional univariate and oblique decision tree algorithms (C4.5, CART, OC1 and FT), as well as to a standard SVM implementation, using real gene expression benchmark data. Experimental results show the effectiveness of the proposed approach in several cases.
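The abstract does not give the details of BUTIA, so the following is only a toy sketch of the general bottom-up idea, not the authors' algorithm. It assumes two deliberate simplifications: one leaf per class (a stand-in for the clustering step) and, at each merge, a hyperplane that is the perpendicular bisector of the two child centroids (a stand-in for the SVM solution). All names (`Node`, `make_leaves`, `merge`, `build_tree`, `predict`) and the sample data are hypothetical. It shows why no impurity measure is needed: when two nodes are merged, the data on each side of the new split is already known.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    points: List[Point]
    centroid: Point
    label: Optional[str] = None     # set for leaves only
    w: Optional[Point] = None       # hyperplane normal (internal nodes only)
    b: float = 0.0                  # hyperplane offset (internal nodes only)
    left: Optional["Node"] = None   # child on the side where w.x + b <= 0
    right: Optional["Node"] = None  # child on the side where w.x + b > 0


def mean(points: Sequence[Point]) -> Point:
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_leaves(X: Sequence[Point], y: Sequence[str]) -> List[Node]:
    """ASSUMPTION: one leaf per class, standing in for a clustering step."""
    groups: Dict[str, List[Point]] = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(tuple(x))
    return [Node(points=pts, centroid=mean(pts), label=lab)
            for lab, pts in groups.items()]


def merge(a: Node, b: Node) -> Node:
    """ASSUMPTION: the split is the perpendicular bisector of the two
    centroids, standing in for an SVM-trained hyperplane."""
    w = tuple(q - p for p, q in zip(a.centroid, b.centroid))
    mid = tuple((p + q) / 2 for p, q in zip(a.centroid, b.centroid))
    offset = -sum(wi * mi for wi, mi in zip(w, mid))
    pts = a.points + b.points
    return Node(points=pts, centroid=mean(pts), w=w, b=offset, left=a, right=b)


def build_tree(X: Sequence[Point], y: Sequence[str]) -> Node:
    """Grow the tree from the leaves up by merging the two closest nodes."""
    nodes = make_leaves(X, y)
    while len(nodes) > 1:
        pairs = ((i, j) for i in range(len(nodes))
                 for j in range(i + 1, len(nodes)))
        i, j = min(pairs, key=lambda ij: sq_dist(nodes[ij[0]].centroid,
                                                 nodes[ij[1]].centroid))
        parent = merge(nodes[i], nodes[j])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [parent]
    return nodes[0]


def predict(node: Node, x: Point) -> str:
    """Descend from the root, testing one linear combination per node."""
    while node.label is None:
        side = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        node = node.right if side > 0 else node.left
    return node.label


# Hypothetical toy data: three well-separated classes in two dimensions.
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6), (12, 0), (13, 0), (12, 1)]
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
root = build_tree(X, y)
```

Calling `predict(root, (5.5, 5.5))` walks the tree from the root down to a leaf. Each internal node tests a linear combination of both attributes, which is what makes the splits oblique rather than axis-parallel.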


