L. J. Muhammad, Muhammed Besiru Jibrin, B. Yahaya, I.A. Mohammed Besiru Jibrin, Abdulkadir Ahmad, Jamila Musa Amshi
Title: An Improved C4.5 Algorithm using Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute for Large Dataset
Published in: 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), 2020-10-29
DOI: 10.1109/ICCKE50421.2020.9303622
Citations: 5
Abstract
Scaling data-mining classification algorithms up to very large datasets has attracted growing interest in recent years. Many techniques have been employed to improve these algorithms, but efficient classification algorithms that lose little accuracy while adding little time complexity remain very important. The C4.5 algorithm is one of the data-mining classification algorithms used to uncover hidden patterns and glean useful, novel knowledge from such large datasets. This work proposes a new C4.5 data-mining algorithm that has lower time complexity than the traditional C4.5 algorithm on large datasets; on smaller datasets, however, the traditional C4.5 algorithm has the lower time complexity. The new algorithm improves on C4.5 using the Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute.
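
For context, the attribute-selection step that traditional C4.5 performs at every tree node can be sketched as below. This is only a minimal illustration of the standard gain-ratio criterion for a categorical attribute, not the improved algorithm proposed in this work; the abstract does not say how the two named techniques modify this step. The function names and the toy data are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute, as in standard C4.5.

    `values` holds the attribute value of each record and `labels` the
    class of each record (hypothetical argument names).
    """
    total = len(labels)
    # Group the class labels by attribute value.
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): one attribute separates the classes, one does not.
play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["true", "false", "true", "false"]
best = max([("outlook", outlook), ("windy", windy)],
           key=lambda pair: gain_ratio(pair[1], play))[0]
```

Each call to `entropy` evaluates one logarithm per distinct value, and C4.5 repeats this for every candidate attribute at every node, which is why this step dominates the running time on large datasets.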