Erwin Kurniawan, F. Nhita, A. Aditsania, D. Saepudin
2019 7th International Conference on Information and Communication Technology (ICoICT), published 2019-10-18. DOI: 10.1109/ICoICT.2019.8835324
C5.0 Algorithm and Synthetic Minority Oversampling Technique (SMOTE) for Rainfall Forecasting in Bandung Regency
Weather affects many human activities, so rainfall predictions need to be accurate. Data mining is one approach to rainfall prediction. In this study, a classification model was built with the C5.0 algorithm to forecast rainfall in Bandung Regency, and the Synthetic Minority Oversampling Technique (SMOTE) was used to address class imbalance in the dataset. The weather data used to build the model were obtained from the Meteorological, Climatological, and Geophysical Agency (BMKG) of Bandung and cover the years 2005 to 2017. The model was validated using k-fold cross-validation. The highest accuracy obtained with C5.0 was 92% on the imbalanced dataset and 99% after adding synthetic data with SMOTE.
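To make the oversampling step concrete, here is a minimal sketch of the core idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. This is an illustration only, not the authors' implementation. The function name `smote`, its parameters, and the toy temperature/humidity values are assumptions made for this example, and the abstract does not state which features or SMOTE settings were used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the chosen one.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (temperature in °C, relative humidity in %).
no_rain = [(30.1, 60.0), (29.5, 62.0), (31.0, 58.0), (28.7, 65.0),
           (30.4, 61.0), (29.9, 59.0), (31.2, 57.0), (30.8, 63.0)]
rain = [(24.0, 90.0), (23.5, 92.0), (25.1, 88.0)]

# Add enough synthetic "rain" samples to match the majority class size.
new_rain = smote(rain, n_synthetic=len(no_rain) - len(rain))
balanced_rain = rain + new_rain
print(len(no_rain), len(balanced_rain))  # 8 8
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region already occupied by the minority class. When oversampling is combined with k-fold cross-validation, it is commonly applied to the training folds only, so that the held-out fold contains no synthetic samples; the abstract does not say which order was used in the study.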