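Introduction of pruning based on measures of association in the expansion phase of the induction algorithm CART: a case study of pre-university orientation for Sidi Mohamed Ben Abdelah University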

International journal of society systems science | Pub Date: 2017-08-16 | DOI: 10.1504/IJSSS.2017.10006643

Imane Satauri, O. Beqqali

Volume 9 (1), p. 165 | Journal Article | https://doi.org/10.1504/IJSSS.2017.10006643

Citations: 0

Abstract



Determining the right tree size is a crucial step when building a decision tree from a large volume of data, because it largely determines how well the tree performs once deployed on the population. Two extremes must be avoided: underfitting (under-learning), where an undersized tree captures the relevant information in the training data poorly; and overfitting (over-learning), where an oversized tree captures specifics of the training data that do not carry over to the population. Either way, the resulting prediction model performs worse. This paper presents an indirect pre-pruning approach introduced within the expansion phase of the classification and regression tree (CART) algorithm; it is based on the rules generated from the decision tree and uses validation criteria inspired by the data mining techniques used to discover association rules.
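The pre-pruning idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the association measures or thresholds, so this example assumes support and confidence, the two standard association-rule criteria. The function names, the toy data, and the values `min_support` and `min_confidence` are all hypothetical, and the code is not the authors' algorithm. Each node is read as a rule "path conditions -> majority class", and the rule's measures are checked before the node is expanded.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def rule_measures(labels, n_total):
    """Majority class, support and confidence of the rule ending at a node."""
    majority, count = Counter(labels).most_common(1)[0]
    return majority, len(labels) / n_total, count / len(labels)

def best_split(rows, labels):
    """Binary split (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:  # keeps both children non-empty
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow(rows, labels, n_total, min_support, min_confidence):
    """Grow a CART-style tree, testing rule measures before each expansion."""
    majority, support, confidence = rule_measures(labels, n_total)
    leaf = {"predict": majority, "support": support, "confidence": confidence}
    # Assumed test 1: the rule ending here is already reliable, so stop.
    if confidence >= min_confidence:
        return leaf
    split = best_split(rows, labels)
    if split is None:
        return leaf
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    # Assumed test 2: each child rule must cover enough of the training data.
    if min(len(left), len(right)) / n_total < min_support:
        return leaf
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left],
                     n_total, min_support, min_confidence),
        "right": grow([r for r, _ in right], [y for _, y in right],
                      n_total, min_support, min_confidence),
    }

def predict(tree, row):
    """Follow the splits down to a leaf and return its class."""
    while "predict" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["predict"]

def count_leaves(tree):
    """Number of leaves, used here as the measure of tree size."""
    if "predict" in tree:
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

# Hypothetical toy data: one numeric feature, with one atypical row at x = 10.
rows = [[x] for x in range(1, 11)]
labels = ["a"] * 5 + ["b"] * 4 + ["a"]

pruned = grow(rows, labels, len(rows), min_support=0.2, min_confidence=0.9)
full = grow(rows, labels, len(rows), min_support=0.0, min_confidence=1.0)
print(count_leaves(pruned), count_leaves(full))
```

On this toy data the pre-pruned tree declines to isolate the single atypical row, because the resulting rule would cover only 10% of the data. With both tests disabled, the tree grows one extra leaf to fit that row. This sketch tests only the best Gini split against the support threshold; a fuller version might fall back to the next-best split instead of stopping.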


Source journal

International journal of society systems science

Self-citation rate

0.00%

Articles published