Title: Investigating the effects of different data classification methods on landslide susceptibility mapping
Authors: Halil Akinci, Ayse Yavuz Ozalp
Journal: Advances in Space Research, Volume 75, Issue 4, Pages 3427-3450
Publication date: 2025-02-15
Publication type: Journal Article
DOI: 10.1016/j.asr.2024.12.020
URL: https://www.sciencedirect.com/science/article/pii/S0273117724012328
Impact factor: 2.8 (JCR Q2, Astronomy & Astrophysics)
Citation count: 0
Abstract
In this study, landslide susceptibility maps (LSMs) were produced for three regions where landslides are common in the Eastern Black Sea Region of Türkiye. The regions studied include the districts of Trabzon, Rize and Artvin. The eXtreme Gradient Boosting (XGBoost) machine learning algorithm was used to generate the LSMs. Ten factors that can affect landslides were used: lithology, land cover, topographic wetness index (TWI), plan curvature, profile curvature, slope, elevation, aspect, distance to roads and distance to drainages. The study tested several spatial data classification methods for these factors. Specifically, the data were categorized using five classification methods: “geometric interval,” “equal interval,” “manual interval,” “natural breaks,” and “quantile.” The main objective was to determine how these classification methods affect the accuracy of LSMs. For this purpose, six models were built with the XGBoost algorithm. In the first model (Model_1), most factors were used as continuous data, while aspect, land cover and lithology were used as discrete data. Each of the other five models categorized the data using one of the classification methods listed above. Model performance was measured using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The results showed that Model_1, which used mostly continuous data, performed best in all three study areas, with the highest AUC value. The model with the lowest AUC value was Model_3, which used the equal interval classification method. The most important finding of this study is that, when producing LSMs, it is preferable to keep continuous data as is rather than reclassifying it, as this improves the accuracy of the susceptibility model.
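To make the comparison concrete, the following is a minimal sketch of two of the classification methods named above (equal interval and quantile) and of AUC computed as a rank statistic. It is not the authors' workflow. The slope values and landslide labels are invented toy data, the function names are hypothetical, and the raw or classified factor value stands in for a model score instead of a trained XGBoost model.

```python
from bisect import bisect_left


def equal_interval_breaks(values, k):
    """Interior class limits that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Interior class limits that put roughly the same number of values in each class."""
    ordered = sorted(values)
    n = len(ordered)
    # Index of the last value belonging to class i, using an integer ceiling of n*i/k.
    return [ordered[(n * i + k - 1) // k - 1] for i in range(1, k)]


def classify(values, breaks):
    """Class index per value; a value equal to a break falls in the lower class."""
    return [bisect_left(breaks, v) for v in values]


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption): slope in degrees and landslide presence (1) or absence (0).
slope = [2, 4, 5, 7, 9, 12, 30, 60]
landslide = [0, 0, 0, 1, 0, 1, 1, 1]

equal_breaks = equal_interval_breaks(slope, 4)   # [16.5, 31.0, 45.5]
quant_breaks = quantile_breaks(slope, 4)         # [4, 7, 12]
equal_classes = classify(slope, equal_breaks)    # [0, 0, 0, 0, 0, 0, 1, 3]
quant_classes = classify(slope, quant_breaks)    # [0, 0, 1, 1, 2, 2, 3, 3]

print("continuous    :", auc(slope, landslide))          # 0.9375
print("equal interval:", auc(equal_classes, landslide))  # 0.75
print("quantile      :", auc(quant_classes, landslide))  # 0.875
```

On this toy data, classification merges values that the continuous factor ranks apart, which creates ties and lowers the AUC. The numbers illustrate the mechanism only and say nothing about the magnitudes reported in the study.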
About the journal:
The COSPAR publication Advances in Space Research (ASR) is an open journal covering all areas of space research including: space studies of the Earth's surface, meteorology, climate, the Earth-Moon system, planets and small bodies of the solar system, upper atmospheres, ionospheres and magnetospheres of the Earth and planets including reference atmospheres, space plasmas in the solar system, astrophysics from space, materials sciences in space, fundamental physics in space, space debris, space weather, Earth observations of space phenomena, etc.
NB: Please note that manuscripts on life sciences as related to space are no longer accepted for submission to Advances in Space Research. Such manuscripts should now be submitted to the new COSPAR journal Life Sciences in Space Research (LSSR).
All submissions are reviewed by two scientists in the field. COSPAR is an interdisciplinary scientific organization concerned with the progress of space research on an international scale. Operating under the rules of ICSU, COSPAR ignores political considerations and considers all questions solely from the scientific viewpoint.