A Comparison of Class Imbalance Techniques for Real-World Landslide Predictions

2017 International Conference on Machine Learning and Data Science (MLDS). Pub Date: 2017-12-01. DOI: 10.1109/MLDS.2017.21

Kapil Agrawal, Yashasvi Baweja, Deepti Dwivedi, Ritwik Saha, P. Prasad, Shubham Agrawal, S. Kapoor, Pratik Chaturvedi, N. Mali, Venkata Uday Kala, V. Dutt

Citations: 17

Abstract

Landslides cause extensive damage to life and property worldwide. Machine-learning research has aimed to predict landslides from statistical analysis of historical landslide events and their triggering factors. However, landslide prediction suffers from a class-imbalance problem, because landslides and land movements are very rare events. In this paper, we apply state-of-the-art techniques to correct the class imbalance in landslide datasets. More specifically, to overcome the class-imbalance problem, we apply different synthetic and oversampling techniques to real-world landslide data collected from the Chandigarh-Manali highway. We also apply several machine-learning algorithms to the landslide dataset to predict landslides, and we evaluate these algorithms using measures such as the area under the ROC curve (AUC) and the sensitivity index (d'). Results suggest that the random forest algorithm performed better than other classification techniques such as neural networks, logistic regression, support vector machines, and decision trees. Furthermore, among class-imbalance methods, the Synthetic Minority Oversampling Technique with iterative partitioning filter (SMOTE-IPF) performed better than the other techniques. We highlight the implications of our results and methods for predicting landslides in the real world.
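The two evaluation measures and the oversampling idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`d_prime`, `auc`, `smote_like`) and the toy data are hypothetical, the sensitivity index is assumed to follow the usual signal-detection definition d' = Z(hit rate) - Z(false-alarm rate), and the oversampler shows only plain SMOTE-style interpolation, without the iterative partitioning filter that SMOTE-IPF adds.

```python
import random
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = Z(hit rate) - Z(false-alarm rate).

    Both rates must lie strictly between 0 and 1, otherwise the
    inverse normal CDF is undefined and inv_cdf raises an error.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def auc(labels, scores):
    """AUC as the probability that a random positive sample scores
    higher than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points, each interpolated between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours (squared Euclidean distance). No noise filtering is done."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Toy data (made up): 1 = landslide, 0 = no landslide.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

print(auc(labels, scores))
print(d_prime(0.8, 0.2))
print(smote_like(minority, n_new=4))
```

In a real pipeline the oversampling step would be applied to the training split only, so that synthetic points never leak into the data used to compute AUC or d'.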
