Enhancement of Very Fast Decision Tree for Data Stream Mining

IF 1.1; Zone 4, Computer Science; Q4, AUTOMATION & CONTROL SYSTEMS

Studies in Informatics and Control. Pub Date: 2022-06-30. DOI: 10.24846/v31i2y202205

Mai Lefa, Hatem Abd-Elkader, Rashed K. Salem


Citations: 0

Abstract

Traditional machine learning (ML) algorithms use static datasets to model knowledge. There is now growing demand for ML-based solutions that can handle very large volumes of data arriving as never-ending streams. The Very Fast Decision Tree (VFDT) is one of the most widely used data stream mining (DSM) algorithms, even though it wastes a large amount of energy on trivial calculations. When designing such algorithms, the machine learning community has prioritized accuracy and execution time. Many studies, however, treat energy usage as a crucial factor when assessing data mining algorithms. The purpose of this research is to create a hyper model that optimizes the VFDT algorithm by reducing wasted energy while maintaining accuracy. In the proposed method, some fixed algorithm parameters were changed to dynamic parameters after each was analyzed separately to determine how much it helps reduce energy consumption in several cases within the algorithm. Experiments were conducted on both the basic algorithm and the proposed version, using several different types of datasets in the same application environment. Compared with the basic algorithm, the proposed method showed a noticeable improvement in performance: lower energy consumption with accuracy levels maintained.
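For readers unfamiliar with VFDT, the following is a minimal sketch of the split test from the standard VFDT formulation, not of the authors' proposed method. The abstract does not say which parameters were made dynamic; the names `delta` (split confidence), `tie_threshold`, and `value_range`, and their default values, are illustrative assumptions here.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 delta: float = 1e-7, tie_threshold: float = 0.05,
                 value_range: float = 1.0) -> bool:
    """Decide whether a leaf that has seen n examples should split.

    Split when the best attribute beats the runner-up by more than
    epsilon, or when epsilon has shrunk below the tie threshold
    (the two candidates are then treated as equally good).
    Default values are assumptions chosen for illustration only.
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    print(should_split(0.30, 0.05, n=200))   # clear winner
    print(should_split(0.30, 0.25, n=200))   # too close, keep collecting
    print(should_split(0.30, 0.25, n=5000))  # tie broken by threshold
```

In standard VFDT this check is run only every fixed number of examples per leaf (the grace period), and `delta` and `tie_threshold` are likewise fixed constants; these are the kind of fixed parameters that a dynamic scheme could adjust at run time.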


Source journal

Studies in Informatics and Control (AUTOMATION & CONTROL SYSTEMS; OPERATIONS RESEARCH & MANAGEMENT SCIENCE)

CiteScore

2.70

Self-citation rate

25.00%

Articles published

Review time

>12 weeks

About the journal: Studies in Informatics and Control journal provides important perspectives on topics relevant to Information Technology, with an emphasis on useful applications in the most important areas of IT. This journal is aimed at advanced practitioners and researchers in the field of IT and welcomes original contributions from scholars and professionals worldwide. SIC is published both in print and online by the National Institute for R&D in Informatics, ICI Bucharest. Abstracts, full text and graphics of all articles in the online version of SIC are identical to the print version of the Journal.