{"title":"KOMBINASI METODE K-MEANS DAN DECISION TREE DENGAN PERBANDINGAN KRITERIA DAN SPLIT DATA","authors":"Elly Muningsih","doi":"10.33365/jti.v16i1.1561","DOIUrl":null,"url":null,"abstract":"Data mining is a process of looking for patterns or pulling large and selected data information using certain techniques or methods. The K-Means and Decision Tree methods are part of the Data Mining technique. This study will combine the K-Means method to cluster data into three clusters then the results of the clustering will be classified using the Decision Tree Method with a comparison of the Gain Ratio, Information Gain and Gini Index criteria. The data is processed into two, namely training data and testing data with a percentage of 70:30, 80:20 and 90:10. The results of the research are to find out which criteria produce the best decision tree and performance based on the highest accuracy value from each data group. The data is taken from the UCI Repository with a total of 811 records and 52 attributes. From the data processing carried out, it is known that for the 70:30 data split, the accuracy value with the Gain Ratio, Information Gain and Gini Index criteria gets the same value, namely 97.53. The Gain Ratio and Gini Index criteria produce the highest accuracy value, which is 98.15% for 80:20 split data. While Information Gain got the highest accuracy value of 98.77% for 90:10 data split. 
Keyword : data mining, clustering, k-means, classification, decision tree","PeriodicalId":344455,"journal":{"name":"Jurnal Teknoinfo","volume":"86 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Jurnal Teknoinfo","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.33365/jti.v16i1.1561","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
COMBINATION OF THE K-MEANS AND DECISION TREE METHODS WITH A COMPARISON OF CRITERIA AND DATA SPLITS
Data mining is the process of finding patterns in, and extracting information from, large sets of selected data using particular techniques or methods. K-Means and Decision Tree are two such data mining methods. This study combines them: K-Means first groups the data into three clusters, and the resulting cluster assignments are then classified with a Decision Tree, comparing the Gain Ratio, Information Gain, and Gini Index splitting criteria. The data are divided into training and testing sets at ratios of 70:30, 80:20, and 90:10. The aim is to determine which criterion produces the best decision tree, judged by the highest accuracy for each data split. The data are taken from the UCI Repository and comprise 811 records and 52 attributes. For the 70:30 split, all three criteria reach the same accuracy, 97.53%. Gain Ratio and Gini Index produce the highest accuracy, 98.15%, on the 80:20 split, while Information Gain produces the highest accuracy, 98.77%, on the 90:10 split.
Keywords: data mining, clustering, k-means, classification, decision tree
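The workflow described in the abstract (cluster first, then classify the cluster labels under several criteria and split ratios) can be sketched as follows. This is only an illustration under stated assumptions, not the author's implementation: the tooling used in the study is not given here, so scikit-learn is assumed; the UCI data are replaced by a synthetic stand-in of the same shape (811 records, 52 attributes); and because scikit-learn's `DecisionTreeClassifier` offers no Gain Ratio criterion, only `entropy` (information gain) and `gini` are compared. The random seeds and variable names are likewise hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the UCI data set (assumption): 811 records, 52 attributes.
X, _ = make_blobs(n_samples=811, n_features=52, centers=3, random_state=0)

# Step 1: cluster the records into three groups; the cluster ids act as class labels.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Step 2: for each train/test ratio and criterion, fit a tree and record test accuracy.
# Gain Ratio is omitted because scikit-learn does not provide it.
results = {}
for test_size in (0.30, 0.20, 0.10):  # 70:30, 80:20, 90:10
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=test_size, stratify=labels, random_state=0
    )
    for criterion in ("entropy", "gini"):
        tree = DecisionTreeClassifier(criterion=criterion, random_state=0)
        tree.fit(X_train, y_train)
        results[(test_size, criterion)] = accuracy_score(y_test, tree.predict(X_test))

for (test_size, criterion), accuracy in results.items():
    train_pct = round((1 - test_size) * 100)
    print(f"{train_pct}:{100 - train_pct} split, {criterion}: {accuracy:.2%}")
```

Because the data here are synthetic, the printed accuracies are not expected to match the figures reported in the abstract; the sketch only shows how the comparison grid is laid out.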