Learning acyclic decision trees with Functional Dependency Network and MDL Genetic Programming
Authors: Wing-Ho Shum, K. Leung, M. Wong
Venue: 2006 International Multi-Conference on Computing in the Global Information Technology (ICCGI'06)
Published: 2006-08-01
DOI: 10.1109/ICCGI.2006.41 (https://doi.org/10.1109/ICCGI.2006.41)
Citations: 6
Abstract
One objective of data mining is to discover parent-child relationships among a set of variables in a domain. Showing the relative importance of the parents can further improve the quality of decision making. A Bayesian network (BN) is a useful model for multi-class problems and can represent parent-child relationships without cycles, but it cannot show the parents' importance. In contrast, decision trees state the parents' importance clearly; for instance, the most important parent is placed at the first level. However, decision trees were proposed for single-class problems only; when applied to multi-class problems, they are likely to produce cycles that represent tautologies. In this paper, we propose using MDL genetic programming (MDLGP) and a functional dependency network (FDN) to learn a set of acyclic decision trees (Shum et al., 2005). The FDN is an extension of the BN: it can handle discrete, continuous, interval, and ordinal values; it is guaranteed to produce decision trees with no cycles; its learning search space is smaller than that of decision trees; and it can represent higher-order relationships among variables. MDLGP is a robust genetic programming (GP) method proposed to learn the FDN. We also propose a method to derive acyclic decision trees from the FDN. The experimental results demonstrate that the proposed method can successfully discover the target decision trees, which contain no cycles and give accurate classification results.
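To make the cycle problem described in the abstract concrete, the following is a minimal sketch of an acyclicity check over parent-child relationships. It is not the paper's MDLGP or FDN learning procedure, which the abstract does not specify. The `parents` dictionary representation, the function name `is_acyclic`, and the example variables A, B, and C are all assumptions made for illustration. The check uses Kahn's topological-sort algorithm.

```python
from collections import deque


def is_acyclic(parents):
    """Return True if the parent -> child graph has no directed cycle.

    `parents` maps each variable to the set of variables used to predict it.
    This representation is hypothetical and not taken from the paper.
    """
    # Collect every variable that appears as a child or as a parent.
    nodes = set(parents)
    for ps in parents.values():
        nodes |= set(ps)

    # Count, for each node, how many of its parents are still unprocessed.
    indegree = {n: len(parents.get(n, ())) for n in nodes}

    # Build the reverse lookup: parent -> list of children.
    children = {n: [] for n in nodes}
    for child, ps in parents.items():
        for p in ps:
            children[p].append(child)

    # Kahn's algorithm: repeatedly remove nodes with no remaining parents.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for c in children[node]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)

    # Nodes on a cycle are never removed, so they are never counted.
    return visited == len(nodes)


# One tree learned per target variable, independently: A is predicted from B
# and B from A, which is the kind of cycle the abstract warns about.
independent_trees = {"A": {"B"}, "B": {"A"}, "C": {"A", "B"}}

# A network-style structure in which the parent sets admit an ordering.
network = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
```

Under these assumptions, `is_acyclic(independent_trees)` is false because A and B predict each other, while `is_acyclic(network)` is true. A learner that guarantees acyclic output, as the abstract claims for the FDN, would only ever produce structures of the second kind.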