Comparison of Algorithms for Fuzzy Decision Tree Induction
J. Rabcan, Patrik Rusnak, J. Kostolny, R. Stankovic
2020 18th International Conference on Emerging eLearning Technologies and Applications (ICETA)
DOI: 10.1109/ICETA51985.2020.9379189
Published: 2020-11-12
Citations: 1
Abstract
The objective of classification is to assign a class label to a data sample based on previously learned experience. Despite the long tradition of research on classification algorithms, no single technique yields the best classification performance in all scenarios. Many real-world problems involve uncertainty, which makes crisp classification difficult to perform. Fuzzy logic can describe such problems with higher fidelity. In this paper, we describe and compare induction algorithms for Fuzzy Decision Trees (FDTs), which extend traditional decision trees. Decision trees are popular for their understandability and interpretability, but many algorithms for FDT induction exist today, and they differ in many aspects. The most common difference lies in the information measure used to select splitting attributes. Generally, the goal of decision tree induction is to create the smallest tree that is as accurate as possible. In this paper, we induce FDTs with various information measures and compare the accuracy and size of the resulting trees. The comparison includes information measures based on a generalization of the Shannon entropy and on cumulative mutual information. It shows that FDTs based on cumulative mutual information achieved the best results. These results will also be included in a course on data mining taught at the Faculty of Management Science and Informatics of the University of Zilina.
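To illustrate the kind of information measure the abstract refers to, the sketch below computes a fuzzy generalization of Shannon entropy and the resulting information gain for a candidate splitting attribute. This is a minimal illustrative example, not the paper's exact formulation: the function names are ours, and the choice of the minimum t-norm for combining node and term memberships is one common convention among several.

```python
import math

def fuzzy_entropy(memberships, labels):
    """Fuzzy generalization of Shannon entropy for one tree node.
    memberships[i] is the degree (0..1) to which sample i belongs
    to the node; labels[i] is its class label. Class proportions
    are computed from summed membership degrees instead of counts."""
    total = sum(memberships)
    if total == 0.0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = sum(m for m, y in zip(memberships, labels) if y == c) / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

def fuzzy_information_gain(node_memberships, labels, term_memberships):
    """Gain of splitting a node on an attribute with several fuzzy
    linguistic terms. term_memberships[t][i] is the membership of
    sample i in term t. A child's membership is the minimum (a common
    t-norm) of the node's and the term's membership degrees."""
    parent_h = fuzzy_entropy(node_memberships, labels)
    total = sum(node_memberships)
    expected_child_h = 0.0
    for term in term_memberships:
        child = [min(m, t) for m, t in zip(node_memberships, term)]
        weight = sum(child) / total if total else 0.0
        expected_child_h += weight * fuzzy_entropy(child, labels)
    return parent_h - expected_child_h
```

During FDT induction, the attribute maximizing this gain at a node would be chosen as the splitting attribute; with crisp (0/1) memberships the measure reduces to the classical information gain of ID3-style trees.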