Data aggregation, ML ready datasets, and an API: leveraging diverse data to create enhanced characterizations of monsoon flood risk

IF 4.1 Q2 ENVIRONMENTAL SCIENCES

Frontiers in Climate. Pub Date: 2023-07-13. DOI: 10.3389/fclim.2023.1107363

Dharma Hoy, Rey L. Granillo, Leland Boeman, B. McMahan, M. Crimmins


Citations: 0

Abstract

Monsoon precipitation and severe flooding are highly variable and often unpredictable, with a range of flood conditions and impacts across a metropolitan region or county. County- and storm-specific watches and warnings issued by the National Weather Service (NWS) alert the public to current flood conditions and risks, but floods are not limited to the area under alert, and these zones can be relatively coarse depending on the data the warnings are based on. Research by the Arizona Institute for Resilient Environments and Societies (AIRES) has produced a data warehouse, accessible through an Application Programming Interface (API), of time-series precipitation totals across the state of Arizona. It consists of higher-resolution, geographically dispersed data that have helped create improved characterizations of monsoon precipitation variability. There is an opportunity to leverage these data to address flood risk, particularly where advanced computer science methodologies and machine learning (ML) techniques may offer additional spatial and temporal insight into flood events. This can be especially useful during rainfall events, when precipitation stations report more frequently and near-real-time totals are accessible via the AIRES API. An ML-ready dataset structured to train ML models facilitates an anticipatory approach to predicting and characterizing flood risk. This offers new inputs for management and decision making, in addition to describing precipitation and flood patterns after an event. In this paper we are the first to make use of the AIRES API, taking the initial step of the machine learning process by assembling the precipitation data into an ML-ready dataset. We then look more closely at the assembled dataset and call attention to characteristics that can be further explored through machine learning. Finally, we summarize future directions for research and climate services using this dataset and API.
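
As a rough illustration of what assembling station precipitation reports into an ML-ready table might involve, the Python sketch below sums irregularly timed reports into hourly totals per station and adds a trailing-window total as a simple feature. It is not the authors' pipeline and does not call the AIRES API: the abstract does not give the API's endpoints or schema, so the record fields (`station`, `time`, `precip_mm`), the sample values, and the function names are all assumptions made for this example. Treating hours without a report as 0.0 mm is also an assumption; a real dataset would need to separate "no rain" from "no report".

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical sample records; the real AIRES API schema is not given in the abstract.
RECORDS = [
    {"station": "A", "time": "2021-07-23T14:05:00", "precip_mm": 2.0},
    {"station": "A", "time": "2021-07-23T14:40:00", "precip_mm": 5.5},
    {"station": "A", "time": "2021-07-23T15:10:00", "precip_mm": 1.0},
    {"station": "A", "time": "2021-07-23T17:20:00", "precip_mm": 0.5},
    {"station": "B", "time": "2021-07-23T14:30:00", "precip_mm": 0.0},
    {"station": "B", "time": "2021-07-23T15:45:00", "precip_mm": 12.0},
]


def hourly_totals(records):
    """Sum irregular station reports into (station, hour) buckets."""
    totals = defaultdict(float)
    for rec in records:
        stamp = datetime.fromisoformat(rec["time"])
        hour = stamp.replace(minute=0, second=0, microsecond=0)
        totals[(rec["station"], hour)] += rec["precip_mm"]
    return totals


def build_rows(records, window_hours=3):
    """Return one row per station and hour on a regular hourly grid.

    Each row holds the hourly total and the total over the trailing
    `window_hours` hours (current hour included). Hours with no report
    are counted as 0.0 mm, which is an assumption of this sketch.
    """
    totals = hourly_totals(records)
    stations = sorted({station for station, _ in totals})
    hours = [hour for _, hour in totals]
    start, end = min(hours), max(hours)

    rows = []
    hour = start
    while hour <= end:
        for station in stations:
            trailing = sum(
                totals.get((station, hour - timedelta(hours=k)), 0.0)
                for k in range(window_hours)
            )
            rows.append({
                "station": station,
                "hour": hour.isoformat(),
                "precip_mm": totals.get((station, hour), 0.0),
                f"precip_mm_last_{window_hours}h": trailing,
            })
        hour += timedelta(hours=1)
    return rows


if __name__ == "__main__":
    for row in build_rows(RECORDS):
        print(row)
```

Putting every station on the same hourly grid gives a model fixed-width rows regardless of how often each gauge reported. The trailing-window column is one example of a derived feature; station coordinates or elevation could be joined on in the same way if the source provides them.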


