Adaptive fuzzy neighborhood decision tree
Xinyu Cui, Changzhong Wang, Shuang An, Yuhua Qian
Applied Soft Computing, Volume 167, Article 112435 (published 2024-11-05). DOI: 10.1016/j.asoc.2024.112435
https://www.sciencedirect.com/science/article/pii/S1568494624012092
Citations: 0
Abstract
Decision tree algorithms have gained widespread acceptance in machine learning, with the central challenge lying in devising an optimal splitting strategy for node sample subspaces. In the context of continuous data, conventional approaches typically involve fuzzifying data or adopting a dichotomous scheme akin to the CART tree. Nevertheless, fuzzifying continuous features often entails information loss, whereas the dichotomous approach can generate an excessive number of classification rules, potentially leading to overfitting. To address these limitations, this study introduces an adaptive growth decision tree framework, termed the fuzzy neighborhood decision tree (FNDT). Initially, we establish a fuzzy neighborhood decision model by leveraging the concept of fuzzy inclusion degree. Furthermore, we delve into the topological structure of misclassified samples under the proposed decision model, providing a theoretical foundation for the construction of FNDT. Subsequently, we utilize conditional information entropy to sift through original features, prioritizing those that offer the maximum information gain for decision tree nodes. By leveraging the conditional decision partitions derived from the fuzzy neighborhood decision model, we achieve an adaptive splitting method for optimal features, culminating in an adaptive growth decision tree algorithm that relies solely on the inherent structure of real-valued data. Experimental evaluations reveal that, compared with advanced decision tree algorithms, FNDT exhibits a simple tree structure, stronger generalization capabilities, and superior performance in classifying continuous data.
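The abstract describes two core steps: measuring how well a continuous feature's fuzzy neighborhoods agree with the decision classes (via conditional information entropy), and selecting the feature with the maximum information gain at each node. The sketch below is a rough illustration of those two steps only, not the authors' FNDT: the triangular similarity function, the radius `delta`, the specific conditional-entropy formula, and all helper names (`fuzzy_neighborhood`, `fuzzy_conditional_entropy`, `select_feature`) are assumptions for illustration; the paper's fuzzy inclusion degree and adaptive splitting rule may be defined differently.

```python
# Illustrative sketch only; assumes features are scaled to [0, 1].
# Not the paper's implementation of FNDT.
import numpy as np

def fuzzy_neighborhood(values, delta=0.2):
    """Pairwise fuzzy similarity on one normalized continuous feature.
    Samples within `delta` get membership 1 - |xi - xj| / delta; others get 0.
    (A common fuzzy-neighborhood-style relation, used here as a stand-in.)"""
    diff = np.abs(values[:, None] - values[None, :])
    return np.where(diff <= delta, 1.0 - diff / delta, 0.0)

def fuzzy_conditional_entropy(sim, labels):
    """One common fuzzy-neighborhood form of H(D | feature):
    -(1/n) * sum_i log(|N(x_i) ∩ [x_i]_D| / |N(x_i)|),
    where cardinalities are sums of fuzzy memberships. Lower values mean
    the neighborhoods agree better with the decision classes."""
    same_class = (labels[:, None] == labels[None, :]).astype(float)
    card_nbhd = sim.sum(axis=1)                    # |N(x_i)|
    card_joint = (sim * same_class).sum(axis=1)    # |N(x_i) ∩ [x_i]_D|
    return -np.mean(np.log(card_joint / card_nbhd))

def select_feature(X, y, delta=0.2):
    """Rank features by gain H(D) - H(D | feature); return the best index."""
    n = len(y)
    # With an all-ones relation the formula reduces to the Shannon entropy H(D).
    baseline = fuzzy_conditional_entropy(np.ones((n, n)), y)
    gains = [baseline - fuzzy_conditional_entropy(fuzzy_neighborhood(X[:, j], delta), y)
             for j in range(X.shape[1])]
    return int(np.argmax(gains)), gains

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((60, 3))                 # three continuous features in [0, 1]
    y = (X[:, 1] > 0.5).astype(int)         # class depends only on feature 1
    best, gains = select_feature(X, y)
    print("selected feature:", best)
    print("gains:", np.round(gains, 3))
```

In this toy run the feature whose fuzzy neighborhoods are most consistent with the class labels (feature 1) receives the largest gain and would be chosen as the node's splitting feature; how FNDT then partitions that feature adaptively is specified in the paper itself.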
Journal overview:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is to publish the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore updated continuously with new articles, and publication times are short.