{"title":"通过顺序查找数据并使用纠错规则归纳决策树","authors":"N. Bathaeian, Muharram Mansoorizadeh","doi":"10.1109/IKT.2016.7777780","DOIUrl":null,"url":null,"abstract":"Decision trees are common algorithms in machine learning. Traditionally, these algorithms make trees recursively and at each step, they inspect data to induce the part of the tree. However decision trees are famous for their instability and high variance in error. In this paper a solution which adds error correction rule to a traditional decision tree algorithm is examined. In fact an algorithm which we call it, ECD3 is introduced. Algorithm of ECD3 inspects data sequentially in an iterative manner and updates tree only when it finds an erroneous observation. This method was first proposed by Dr. Utgoff but not implemented. In this paper, the method is developed and several experiments are performed to evaluate the method. We found that in most cases, performance of ECD3 is comparable to its predecessors. However ECD3 has some benefits over them. First, sizes of its trees are significantly smaller. Second, on average, variance of error in ECD3 is lower. Furthermore, ECD3 automatically chooses part of data for induction of the tree and sets aside others. This capability can be exploited for prototype selection in various learning algorithms. To explain these observations, we use inductive bias and margin definitions in our theories. We introduce a new definition of margin in ordinary decision trees based on shape, size and splitting criteria in trees. We show that how ECD3 expands the margins and enhances precision over test data.","PeriodicalId":205496,"journal":{"name":"2016 Eighth International Conference on Information and Knowledge Technology (IKT)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Induction of decision trees by looking to data sequentially and using error correction rule\",\"authors\":\"N. Bathaeian, Muharram Mansoorizadeh\",\"doi\":\"10.1109/IKT.2016.7777780\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Decision trees are common algorithms in machine learning. Traditionally, these algorithms make trees recursively and at each step, they inspect data to induce the part of the tree. However decision trees are famous for their instability and high variance in error. In this paper a solution which adds error correction rule to a traditional decision tree algorithm is examined. In fact an algorithm which we call it, ECD3 is introduced. Algorithm of ECD3 inspects data sequentially in an iterative manner and updates tree only when it finds an erroneous observation. This method was first proposed by Dr. Utgoff but not implemented. In this paper, the method is developed and several experiments are performed to evaluate the method. We found that in most cases, performance of ECD3 is comparable to its predecessors. However ECD3 has some benefits over them. First, sizes of its trees are significantly smaller. Second, on average, variance of error in ECD3 is lower. Furthermore, ECD3 automatically chooses part of data for induction of the tree and sets aside others. This capability can be exploited for prototype selection in various learning algorithms. To explain these observations, we use inductive bias and margin definitions in our theories. 
We introduce a new definition of margin in ordinary decision trees based on shape, size and splitting criteria in trees. We show that how ECD3 expands the margins and enhances precision over test data.\",\"PeriodicalId\":205496,\"journal\":{\"name\":\"2016 Eighth International Conference on Information and Knowledge Technology (IKT)\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 Eighth International Conference on Information and Knowledge Technology (IKT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IKT.2016.7777780\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Eighth International Conference on Information and Knowledge Technology (IKT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IKT.2016.7777780","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Induction of decision trees by looking to data sequentially and using error correction rule
Decision trees are common algorithms in machine learning. Traditionally, these algorithms build a tree recursively, inspecting the data at each step to induce the next part of the tree. However, decision trees are notorious for their instability and high variance in error. In this paper, a solution that adds an error correction rule to a traditional decision tree algorithm is examined: an algorithm we call ECD3 is introduced. ECD3 inspects the data sequentially in an iterative manner and updates the tree only when it finds an erroneous (misclassified) observation. This method was first proposed by Dr. Utgoff but was never implemented. In this paper, the method is developed and several experiments are performed to evaluate it. We found that in most cases the performance of ECD3 is comparable to that of its predecessors; however, ECD3 has some benefits over them. First, the trees it produces are significantly smaller. Second, on average, the variance of its error is lower. Furthermore, ECD3 automatically chooses part of the data for induction of the tree and sets the rest aside. This capability can be exploited for prototype selection in various learning algorithms. To explain these observations, we use inductive bias and margin definitions in our theories. We introduce a new definition of margin in ordinary decision trees based on the shape, size, and splitting criteria of the trees, and show how ECD3 expands the margins and enhances precision on test data.
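As a rough illustration only: the abstract does not specify ECD3's update rule, so the Python sketch below substitutes the simplest plausible interpretation, retraining a scikit-learn DecisionTreeClassifier on the growing pool of misclassified observations while scanning the data sequentially. The dataset (load_iris), the random scan order, and the retrain-from-scratch update are all assumptions made for illustration, not the paper's actual procedure.

```python
# Sketch of an error-driven tree-induction loop in the spirit of ECD3 as
# described in the abstract. The concrete update rule is not given there,
# so retraining a scikit-learn tree on the accumulated pool of
# misclassified observations is an assumption, not the paper's method.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
order = rng.permutation(len(X))  # look at the data sequentially

# Seed the training pool with the first observation and fit an initial tree.
pool_X, pool_y = [X[order[0]]], [y[order[0]]]
tree = DecisionTreeClassifier(random_state=0).fit(pool_X, pool_y)

for i in order[1:]:
    if tree.predict(X[i].reshape(1, -1))[0] != y[i]:
        # Erroneous observation: add it to the pool and update the tree.
        pool_X.append(X[i])
        pool_y.append(y[i])
        tree = DecisionTreeClassifier(random_state=0).fit(pool_X, pool_y)

# The pool now holds the automatically selected subset of the data;
# correctly classified observations were set aside.
print(f"kept {len(pool_X)} of {len(X)} observations")
```

Note how the loop yields the prototype-selection behavior the abstract mentions: only observations the current tree misclassifies enter the training pool, and everything the tree already handles correctly is set aside.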