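Comparative study on landslide susceptibility mapping based on different ratios of training samples and testing samples by using RF and FR-RF models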

Ke Xu, Zhou Zhao, Wei Chen, Jianquan Ma, Fei Liu, Yihao Zhang, Zijun Ren
Journal: Natural Hazards Research, Volume 4, Issue 1, Pages 62-74
Publication date: 2024-03-01
Publication type: Journal Article
DOI: 10.1016/j.nhres.2023.07.004
URL: https://www.sciencedirect.com/science/article/pii/S2666592123000732
Abstract

Evaluating landslide susceptibility is essential to land and space utilization planning. To this end, this paper presents a case study from Fugu County, Shaanxi Province, China. First, the geological environment and the current state of landslides in Fugu County were investigated. Then slope, aspect, terrain relief, curvature, lithology, land type, and normalized difference vegetation index (NDVI) were selected as landslide conditioning factors, and the correlation among them was examined using multicollinearity analysis. Next, the landslide and non-landslide samples were divided into training and testing samples at ratios of 8/2, 7/3, 6/4, and 5/5. Landslide susceptibility maps were produced with a Random Forest (RF) model and a Frequency Ratio coupled with Random Forest (FR-RF) model. Finally, landslide density (LD), landslide frequency ratio (LFR), the area under the receiver operating characteristic (ROC) curve (AUC), and other indicators were used to validate the rationality, accuracy, and performance of the maps produced by the different models and ratios. The results indicate that all maps are reasonable except the one at the 5/5 ratio. For every map, regardless of ratio, LD and LFR are greatest in the zone classed as very high susceptibility, followed by the high, moderate, low, and very low classes.
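The sample division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes landslide and non-landslide samples are shuffled and split separately so that each class keeps the same proportion, and the sample counts, the `split_by_ratio` helper, and the seed are hypothetical (the abstract gives no counts or sampling details).

```python
import random


def split_samples(samples, train_fraction, seed=0):
    """Shuffle a copy of `samples` and cut it into (train, test) parts."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def split_by_ratio(landslides, non_landslides, train_fraction, seed=0):
    """Split each class separately (an assumption), then merge the parts."""
    ls_train, ls_test = split_samples(landslides, train_fraction, seed)
    nl_train, nl_test = split_samples(non_landslides, train_fraction, seed + 1)
    return ls_train + nl_train, ls_test + nl_test


# Hypothetical sample identifiers; the real inventory size is not in the abstract.
landslides = [("landslide", i) for i in range(100)]
non_landslides = [("non_landslide", i) for i in range(100)]

# The four training/testing ratios named in the abstract.
RATIOS = {"8/2": 0.8, "7/3": 0.7, "6/4": 0.6, "5/5": 0.5}

splits = {
    name: split_by_ratio(landslides, non_landslides, fraction)
    for name, fraction in RATIOS.items()
}

for name, (train, test) in splits.items():
    print(name, "train:", len(train), "test:", len(test))
```

Each entry in `splits` could then be passed to whichever classifier is being compared, so that the only thing varying between runs is the training/testing ratio.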

In the RF model, the LD in the very high susceptibility zone is 201.026 at the 7/3 ratio, 154.440 at 8/2, 93.696 at 6/4, and 136.364 at 5/5, while the LFR ranks 7/3 (4.806) > 8/2 (3.692) > 6/4 (3.260) > 5/5 (2.240). In the FR-RF model, across the same ratios, LD ranks 7/3 (145.693) > 6/4 (127.151) > 5/5 (122.857) > 8/2 (113.263) and LFR ranks 7/3 (3.334) > 6/4 (3.073) > 5/5 (2.811) > 8/2 (2.592). In addition, the ROC curves show that both models are more accurate at the 7/3 ratio than at the other ratios. Moreover, the results of the ensemble model (a combination of two models with different learning abilities) are no more reasonable than those of the single model, which suggests that combining a weaker learner (here, the Frequency Ratio model) with a stronger learner (here, the Random Forest model) can diminish the performance of the stronger one.
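The validation indicators compared above can be sketched as below. The abstract does not give formulas, so this assumes commonly used definitions: LD as landslides per unit area of a susceptibility zone, LFR as the zone's share of landslides divided by its share of area, and AUC as the probability that a landslide sample scores higher than a non-landslide sample. All counts, areas, labels, and scores are hypothetical and are not the study's data.

```python
def zone_statistics(zones):
    """Map zone name -> (LD, LFR) from zone name -> (landslide count, area).

    Assumed definitions (not stated in the abstract):
      LD  = landslide count / zone area
      LFR = (zone share of landslides) / (zone share of area)
    """
    total_landslides = sum(count for count, _ in zones.values())
    total_area = sum(area for _, area in zones.values())
    result = {}
    for name, (count, area) in zones.items():
        density = count / area
        frequency_ratio = (count / total_landslides) / (area / total_area)
        result[name] = (density, frequency_ratio)
    return result


def roc_auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical zones: (landslide count, area in km^2).
zones = {
    "very low": (2, 400.0),
    "low": (8, 300.0),
    "moderate": (15, 150.0),
    "high": (25, 100.0),
    "very high": (50, 50.0),
}
stats = zone_statistics(zones)
for name, (density, frequency_ratio) in stats.items():
    print(f"{name}: LD={density:.3f}, LFR={frequency_ratio:.3f}")

# Hypothetical test labels (1 = landslide) and predicted susceptibility scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print("AUC:", roc_auc(labels, scores))
```

Under these definitions, a reasonable map is one where LD and LFR rise steadily from the very low class to the very high class, which is the pattern the abstract reports.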
