An Improved C4.5 Algorithm using Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute for Large Dataset

L. J. Muhammad, Muhammed Besiru Jibrin, B. Yahaya, I.A. Mohammed Besiru Jibrin, Abdulkadir Ahmad, Jamila Musa Amshi
Published in: 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), 2020-10-29
DOI: 10.1109/ICCKE50421.2020.9303622 (https://doi.org/10.1109/ICCKE50421.2020.9303622)
Citations: 5

Abstract

Scaling data-mining classification algorithms up to very large datasets has attracted growing interest in recent years. Many techniques have been employed to improve these algorithms, but efficient classification algorithms that lose little accuracy and add little time complexity remain very important. The C4.5 algorithm is one of the data-mining classification algorithms used to uncover hidden patterns and glean useful, novel knowledge from such large datasets. This work proposes a new C4.5 algorithm that has lower time complexity than the traditional C4.5 algorithm on large datasets; on smaller datasets, however, the traditional C4.5 algorithm has the lower time complexity. The new algorithm was obtained by applying the Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute.
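
For context, the attribute-selection step that traditional C4.5 performs can be sketched as follows. This is a minimal illustration of the textbook gain-ratio criterion for a single categorical attribute, not the improved algorithm the paper proposes; the abstract does not give the details of the proposed modifications, so they are not reproduced here. The function names `entropy` and `gain_ratio` are hypothetical, chosen for this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Textbook C4.5 gain ratio for one categorical attribute.

    values: the attribute value of each example
    labels: the class label of each example (same length as values)
    """
    total = len(labels)

    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)

    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information is the entropy of the attribute values themselves.
    split_info = entropy(values)

    # An attribute with a single value cannot split the data.
    return info_gain / split_info if split_info > 0 else 0.0
```

Each call to `entropy` evaluates a logarithm per distinct value, and the criterion is recomputed for every candidate attribute at every tree node, which is one reason the cost of this step matters on large datasets.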