Predicting Climate Change Related Extreme Natural Disasters Using Machine Learning in Zambia

Zambia ICT Journal. Pub Date: 2022-12-26. DOI: 10.33260/zictjournal.v6i1.128

D. Phiri, C. Chembe



Abstract

One of the most pressing concerns facing humanity today is climate change, which has increased the frequency of natural disasters that threaten the social and economic stability of populations. Zambia’s vulnerability to disasters remains high because the country still lacks an effective Early Warning System (EWS). This study evaluates Machine Learning (ML) algorithms that have been successfully applied to disaster prediction in order to develop a model for Zambia. Six ML algorithms are compared, namely Logistic Regression (LR), Random Forest (RF), K-Nearest Neighbor (KNN), Gaussian Naive Bayes (GNB), Decision Tree (DT), and Support Vector Machine (SVM), and the best performing one is chosen. Historical climate data were obtained from the Zambia Meteorological Department (ZMD), while historical natural disaster data were obtained online because they are not locally available. The results show that LR and SVM performed better than the other algorithms, each scoring 73.0% accuracy. LR was chosen to produce the final model because it has a shorter computational time than SVM. The model was then incorporated into a web service and an Android application for deployment. However, the high number of outliers, missing values and highly imbalanced classes affect the performance of the model. ML data cleaning and feature engineering techniques, such as data imputation and oversampling, were applied, but some challenges persist because these techniques have their own flaws. Therefore, the model’s performance on real-world data is likely to be affected.
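The abstract names data imputation and oversampling as the remedies applied to missing values and imbalanced classes, but does not say which variants were used. The following is a minimal sketch under stated assumptions: median imputation and random oversampling are stand-ins chosen for illustration, and the function names, feature columns and toy records are all hypothetical, not taken from the study.

```python
import random
import statistics


def impute_median(rows):
    """Replace None entries in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(statistics.median(observed))
    return [[medians[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class has equal count."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy records: [rainfall_mm, max_temp_c]; label 1 marks a disaster event.
X = [[12.0, 31.0], [None, 29.5], [140.0, None], [8.0, 33.0], [5.0, 30.0], [160.0, 27.0]]
y = [0, 0, 1, 0, 0, 1]

X_filled = impute_median(X)
X_bal, y_bal = random_oversample(X_filled, y)
print(X_filled)
print(y_bal.count(0), y_bal.count(1))
```

Both steps illustrate the kind of flaw the abstract alludes to: median imputation pulls missing readings toward typical values, which can mask the extremes that matter for disaster prediction, and random oversampling only repeats existing minority records, so it adds no new information. In practice, oversampling would normally be applied to the training split only, so that duplicated records do not leak into the evaluation data.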

