A benchmark dataset and workflow for landslide susceptibility zonation

Impact factor: 10.8; Zone 1, Earth Sciences; JCR Q1, Geosciences, Multidisciplinary
Massimiliano Alvioli, Marco Loche, Liesbet Jacobs, Carlos H. Grohmann, Minu Treesa Abraham, Kunal Gupta, Neelima Satyam, Gianvito Scaringi, Txomin Bornaetxea, Mauro Rossi, Ivan Marchesini, Luigi Lombardo, Mateo Moreno, Stefan Steger, Corrado A.S. Camera, Greta Bajni, Guruh Samodra, Erwin Eko Wahyudi, Nanang Susyanto, Marko Sinčić, Jhonatan Rivera-Rivera
Earth-Science Reviews, volume 258, Article 104927. Published 2024-09-11. DOI: 10.1016/j.earscirev.2024.104927
Publisher page: https://www.sciencedirect.com/science/article/pii/S0012825224002551

Abstract

Landslide susceptibility shows the spatial likelihood of landslide occurrence in a specific geographical area and is a relevant tool for mitigating the impact of landslides worldwide. As such, it is the subject of countless scientific studies. Many methods exist for generating a susceptibility map, mostly falling under the definition of statistical or machine learning. These models try to solve a classification problem: given a collection of spatial variables, and their combination associated with landslide presence or absence, a model should be trained, tested to reproduce the target outcome, and eventually applied to unseen data.
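
The train, test, and apply cycle described above can be sketched in a few lines. This is only an illustration, not the workflow used in the study: the single predictor `steepness`, the toy labels, the even/odd split, and the hand-written logistic regression are all assumptions made for the example.

```python
import math


def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_toy_slope_units(n=40):
    """Hypothetical data: one normalised predictor per slope unit plus a
    0/1 landslide presence label. Not taken from the benchmark dataset."""
    units = []
    for i in range(n):
        steepness = i / (n - 1)
        label = 1 if steepness > 0.5 else 0
        units.append((steepness, label))
    # Flip two labels so that the two classes overlap, as real data would.
    units[16] = (units[16][0], 1)
    units[24] = (units[24][0], 0)
    return units


def train_logistic(samples, learning_rate=1.0, epochs=2000):
    """Fit weight and bias by batch gradient descent on the log-loss."""
    weight, bias = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in samples:
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def predict(weight, bias, x):
    """Susceptibility (probability of landslide presence) for one unit."""
    return sigmoid(weight * x + bias)


units = make_toy_slope_units()
train = units[0::2]   # units used to fit the model
test = units[1::2]    # held-out units, unseen during training
weight, bias = train_logistic(train)
test_probs = [predict(weight, bias, x) for x, _ in test]
accuracy = sum((p >= 0.5) == (y == 1)
               for p, (_, y) in zip(test_probs, test)) / len(test)
print(f"weight={weight:.2f} bias={bias:.2f} held-out accuracy={accuracy:.2f}")
```

Any classifier that outputs a probability per mapping unit could take the place of `train_logistic` here; the point is only the separation between data used for fitting and data used for testing.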

Unlike many fields of science that use machine learning for specific tasks, landslide susceptibility has no reference data for assessing the performance of a given method. Here, we propose a benchmark dataset consisting of 7360 slope units encompassing an area of about 4,100 km² in Central Italy. Using the dataset, we tried to answer two open questions in landslide research: (1) what effect does human variability have on the creation of susceptibility models; and (2) how can we develop a reproducible workflow that allows meaningful model comparisons within the landslide susceptibility research community?

With these questions in mind, we released a preliminary version of the dataset, along with a “call for collaboration,” aimed at collecting different calculations using the proposed data, and leaving the freedom of implementation to the respondents. Contributions were different in many respects, including classification methods, use of predictors, implementation of training/validation, and performance assessment. That feedback suggested refining the initial dataset, and constraining the implementation workflow. This resulted in a final benchmark dataset and landslide susceptibility maps obtained with many classification methods.

Values of the area under the receiver operating characteristic curve obtained with the final benchmark dataset were rather similar, as an effect of constraints on training, cross-validation, and use of data. Brier score results, instead, show larger variability, ascribed to different model predictive abilities. Correlation plots show similarities between results of different methods applied by the same group, ascribed to a residual implementation dependence.
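
To make the three quantities in this paragraph concrete, the sketch below computes the area under the ROC curve, the Brier score, and a Pearson correlation for two hypothetical models. The labels and the scores `model_a` and `model_b` are invented for illustration and are not results from the study. The example only shows one way in which AUC can agree while Brier scores differ: AUC depends on ranking alone, whereas the Brier score also penalises probabilities that sit far from the observed 0/1 outcome.

```python
import math


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive unit gets the higher score; ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def brier_score(labels, scores):
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


def pearson(a, b):
    """Pearson correlation coefficient between two score vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical landslide presence (1) / absence (0) for eight slope units.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
# Two invented models that rank the units identically; model_b keeps all
# of its probabilities close to 0.5.
model_a = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90]
model_b = [0.45, 0.46, 0.47, 0.48, 0.52, 0.53, 0.54, 0.55]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    print(name, "AUC =", roc_auc(labels, scores),
          "Brier =", round(brier_score(labels, scores), 4))
print("correlation =", round(pearson(model_a, model_b), 4))
```

Because the two score vectors order the units in the same way, their AUC values coincide, while the Brier score of `model_b` is worse. Maintained library implementations of these metrics would normally be preferred over hand-written ones in practice.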

We stress that the experiment did not intend to select the “best” method but only to establish a first benchmark dataset and workflow that may be useful as a standard reference for calculations by other scholars. The experiment, to our knowledge, is the first of its kind for landslide susceptibility modeling. The data and workflow presented here comparatively assess the performance of independent methods for landslide susceptibility, and we suggest the benchmark approach as a best practice for quantitative research in geosciences.

Source journal
Earth-Science Reviews (Geosciences, Multidisciplinary)
CiteScore: 21.70
Self-citation rate: 5.80%
Articles published: 294
Review time: 15.1 weeks
About the journal: Covering a much wider field than the usual specialist journals, Earth-Science Reviews publishes review articles dealing with all aspects of the Earth sciences and is an important vehicle for allowing readers to see their particular interest in relation to the Earth sciences as a whole.