The Improvement of C4.5 Algorithm Accuracy in Predicting Forest Fires Using Discretization and AdaBoost

Tomi Bagus Nugroho, E. Sugiharti
Journal of Advances in Information Systems and Technology. Published 2021-04-14. DOI: 10.15294/jaist.v3i1.49094 (https://doi.org/10.15294/jaist.v3i1.49094)

Abstract

Data mining is the process of analyzing data collected from a given setting using mathematical methods. Decision trees are among the most widely used data mining algorithms, and C4.5 is one such decision tree algorithm. In practice, a data mining workflow consists of preprocessing, mining, pattern evaluation, and knowledge presentation. The forest fire data used in this study were taken from the UCI Machine Learning Repository. Data normalization, data transformation, and discretization were used for preprocessing. To improve accuracy, the C4.5 algorithm can be combined with AdaBoost. This study aims to determine how applying discretization to the C4.5 algorithm with AdaBoost performs in predicting forest fires, and how much it improves accuracy. Under 10-fold cross-validation, the highest accuracy obtained was 98.04%. Applying discretization and AdaBoost increased the accuracy of forest fire prediction by 13.42%.
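
The abstract names three building blocks: discretization, the C4.5 split criterion, and AdaBoost. The sketch below illustrates each in plain Python and is not the authors' implementation. It assumes equal-width binning (the abstract does not say which discretization method was used), the standard C4.5 gain ratio, and the vote weight from the common binary form of AdaBoost. The function names and the toy temperature values are hypothetical and are not taken from the UCI data.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the largest value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def gain_ratio(feature, labels):
    """C4.5 split criterion: information gain divided by split information."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(feature)
    return gain / split_info if split_info > 0 else 0.0


def adaboost_alpha(weighted_error):
    """Vote weight of one weak learner in binary AdaBoost (0 < error < 1)."""
    return 0.5 * math.log((1 - weighted_error) / weighted_error)


# Toy values, invented for illustration only.
temperature = [8.2, 18.0, 14.6, 22.8, 30.2, 25.1, 11.4, 27.9]
fire = ["no", "no", "no", "yes", "yes", "yes", "no", "yes"]

bins = equal_width_bins(temperature, 2)
print(bins)
print(gain_ratio(bins, fire))
print(adaboost_alpha(0.1))
```

In this toy case the two temperature bins separate the classes perfectly, so the discretized attribute is an ideal split. A learner whose weighted error is below 0.5 receives a positive vote weight, and one at exactly 0.5 receives a weight of zero.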