Enhancing Debris Flow Warning via Machine Learning Feature Reduction and Model Selection

IF 3.8, Zone 2 (Earth Sciences), Q1 GEOSCIENCES, MULTIDISCIPLINARY

Journal of Geophysical Research: Earth Surface, Volume 130, Issue 4. Pub Date: 2025-04-26. DOI: 10.1029/2024JF008094

Qi Zhou, Hui Tang, Clément Hibert, Małgorzata Chmiel, Fabian Walter, Michael Dietze, Jens M. Turowski



Abstract

The advent of machine learning has significantly improved the accuracy of identifying mass movements through the seismic waves they generate, making it possible to implement real-time early warning systems for debris flows. However, we lack a thorough understanding of which seismic features are effective and of the limitations of different machine learning models. In this work, we investigate eighty seismic features and three machine learning models for single-station-based binary debris flow classification and multi-station-based warning tasks. These seismic features, derived from physical and statistical knowledge of impact sources, are grouped into five sets: Benford's law, waveform, spectra, spectrogram, and network. The machine learning models belong to two families: two ensemble models, Random Forest and eXtreme Gradient Boosting (XGBoost), and one recurrent neural network model, Long Short-Term Memory (LSTM). We analyzed feature importance from the ensemble models and found that the number and even the types of seismic features are not critical for training an effective binary classifier for debris flow. The LSTM, which is designed to capture patterns in sequential data rather than only the information in one given window, does not significantly improve the performance of the binary debris flow classification task over Random Forest and XGBoost. For the multi-station-based debris flow warning task, however, the LSTM model predicts debris flow probability more consistently and provides longer warning times. Our proposed framework simplifies machine learning-driven debris flow classification and lays the foundation for affordable seismic signal-driven early warning using a sparse seismic network.
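
The abstract names Benford's law as one of the five feature sets but does not say how those features are computed. As an illustration only, the sketch below shows one common way to turn a window of seismic amplitudes into a single Benford-type feature: compare the observed first-digit frequencies with the Benford expectation P(d) = log10(1 + 1/d). The function names (`first_digit`, `first_digit_distribution`, `benford_misfit`) and the sum-of-squares misfit are assumptions made for this example, not the authors' definitions.

```python
import math

# Expected first-digit probabilities under Benford's law: P(d) = log10(1 + 1/d).
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Return the leading non-zero decimal digit of |x|, or None if undefined."""
    x = abs(x)
    if x == 0 or not math.isfinite(x):
        return None
    # Scientific notation always starts with the leading digit, e.g. '3.21e-03'.
    return int(f"{x:.15e}"[0])


def first_digit_distribution(samples):
    """Relative frequency of first digits 1..9 in a window of amplitudes."""
    counts = [0] * 9
    for s in samples:
        d = first_digit(s)
        if d is not None:
            counts[d - 1] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 9
    return [c / total for c in counts]


def benford_misfit(samples):
    """Hypothetical feature: squared distance between observed and Benford frequencies."""
    observed = first_digit_distribution(samples)
    return sum((o - e) ** 2 for o, e in zip(observed, BENFORD))


if __name__ == "__main__":
    # Amplitudes spread log-uniformly over one decade follow Benford's law closely.
    log_uniform = [10 ** (k / 1000) for k in range(1000)]
    # A constant-amplitude window does not.
    constant = [1.5] * 100
    print(benford_misfit(log_uniform) < benford_misfit(constant))
```

Under these assumptions, a small misfit means the amplitudes in the window span a wide dynamic range in a Benford-like way, and a large misfit means they do not. Whether and how such a value separates debris flows from background noise is a matter for the paper itself.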




Source journal

Journal of Geophysical Research: Earth Surface (Earth and Planetary Sciences: Earth-Surface Processes)

CiteScore: 6.30

Self-citation rate: 10.30%

Articles published: 162