A multi-granularity decision tree algorithm based on variable precision rough sets and Zentropy

IF 6.6 · Zone 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Hui Dong , Caihui Liu , Xiying Chen , Duoqian Miao
DOI: 10.1016/j.asoc.2025.113851
Journal: Applied Soft Computing, Volume 185, Article 113851
Published: 2025-09-11
URL: https://www.sciencedirect.com/science/article/pii/S1568494625011640
Citations: 0

Abstract

Existing decision tree algorithms often use a single-layer measure to process data, which cannot fully capture the complex interactions and dependencies between different granularity levels. In addition, decision tree algorithms inevitably face the issue of multi-value preference, which may lead to the selection of unreasonable attributes during partitioning, thereby degrading the performance of the algorithms. Therefore, this paper proposes an improved decision tree algorithm, called Ze-VNDT, which combines variable precision rough sets with Zentropy. First, to avoid the information loss caused by data discretization, this paper introduces variable precision neighborhood rough sets for data processing. Second, by analyzing the granularity level structure within the variable precision neighborhood rough set model, knowledge uncertainty is analyzed at three granularity levels: decision classes, approximate relations, and similarity classes. We describe the uncertain knowledge from the overall to the internal, proceeding from coarse to fine, and design a Zentropy to measure uncertainty. To address the issue of multi-value preference, an adaptive weighted Zentropy uncertainty measure is designed, building on the Zentropy-based uncertainty measure. Third, when constructing the improved decision tree algorithm, the optimal attributes are selected according to the designed uncertainty measure. Finally, numerical experiments on 18 UCI datasets validated the effectiveness and rationality of the proposed algorithm. The experimental results showed that, compared to traditional algorithms and the latest improved algorithms, the proposed algorithm achieved an average accuracy of 94.79%, an average precision of 85.77%, an average recall of 84.68%, and an F1-score of 84.97% across the 18 datasets. It ranked first in all five evaluation metrics, demonstrating higher stability and accuracy.
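The variable-precision neighborhood rough set construction the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's method: the neighborhood radius `delta`, the inclusion threshold `beta`, and the symmetric upper-approximation bound are standard textbook choices assumed here, and the paper's Zentropy measure built on top of these approximations is not reproduced.

```python
import numpy as np

def neighborhood_lower_upper(X, y, attrs, delta=0.2, beta=0.8):
    """Variable-precision neighborhood rough set approximations (sketch).

    For each decision class c, a sample belongs to the beta-lower
    approximation when at least a fraction beta of its delta-neighborhood
    carries label c, and to the upper approximation when that fraction
    exceeds 1 - beta (a common symmetric variable-precision bound).
    """
    Xa = X[:, attrs]
    # pairwise Euclidean distances on the selected attribute subset
    d = np.linalg.norm(Xa[:, None, :] - Xa[None, :, :], axis=2)
    nbr = d <= delta  # boolean neighborhood matrix (each sample includes itself)
    lower, upper = {}, {}
    for c in np.unique(y):
        in_c = (y == c)
        # inclusion degree of each sample's neighborhood in class c
        incl = (nbr & in_c).sum(axis=1) / nbr.sum(axis=1)
        lower[c] = np.where(incl >= beta)[0]
        upper[c] = np.where(incl > 1 - beta)[0]
    return lower, upper
```

Working directly on the numeric attributes this way is what lets the algorithm skip discretization; an attribute-selection step would then score candidate attribute subsets with an uncertainty measure (in the paper, the adaptive weighted Zentropy) computed over these approximations.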
Source journal: Applied Soft Computing (Engineering/Technology – Computer Science: Interdisciplinary Applications)
CiteScore: 15.80
Self-citation rate: 6.90%
Articles per year: 874
Review time: 10.9 months
Journal description: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is on publishing the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site is continuously updated with new articles and the publication time is short.