Fuzzy C-Means clustering for physical model calibration and 7-day, 10-year low flow estimation in ungaged basins: comparisons to traditional, statistical estimates

Andrew DelSanto, Richard N. Palmer, Konstantinos Andreadis
Frontiers in Water. Published 2024-01-26. DOI: 10.3389/frwa.2024.1332888 (https://doi.org/10.3389/frwa.2024.1332888)

Abstract

In the Northeast U.S., resource managers commonly apply 7-day, 10-year (7Q10) low flow estimates for protecting aquatic species in streams. In this paper, the efficacy of process-based hydrologic models for estimating 7Q10s is evaluated against the United States Geological Survey's (USGS) widely applied web application StreamStats, which uses traditional statistical regression equations for estimating extreme flows. To generate the process-based estimates, the USGS's National Hydrologic Modeling (NHM-PRMS) framework (which relies on traditional rainfall-runoff modeling) is applied with 36 years of forcings from the Daymet climate dataset to a representative sample of ninety-four unimpaired gages in the Northeast and Mid-Atlantic U.S. The rainfall-runoff models are calibrated to the measured streamflow at each gage using the recommended NHM-PRMS calibration procedure and evaluated using Kling-Gupta Efficiency (KGE) for daily streamflow estimation. To compare the 7Q10 estimates made by the rainfall-runoff models with those from StreamStats, several error metrics are applied, including median relative bias (cfs/cfs), Root Mean Square Error (RMSE) (cfs), Relative RMSE (RRMSE) (cfs/cfs), and Unit-Area RMSE (UA-RMSE) (cfs/mi²). The calibrated rainfall-runoff models display both improved daily streamflow estimation (median KGE improving from 0.30 to 0.52) and improved 7Q10 estimation (smaller median relative bias, RMSE, RRMSE, and UA-RMSE, especially for basins larger than 100 mi²). Calibration is then extended to ungaged locations using the machine learning algorithm Fuzzy C-Means (FCM) clustering. Traditional K-Means clustering (FCM clustering with no fuzzification factor) is found to be the preferred method for model regionalization based on (1) Silhouette Analysis, (2) daily streamflow KGE, and (3) 7Q10 error metrics. The optimal rainfall-runoff models created with clustering show improvement for daily streamflow estimation (a median KGE of 0.48, only slightly below that of the calibrated models at 0.52); however, their 7Q10 error metrics are similar to those of the uncalibrated models, and neither set of models improves on the statistical estimates. Results suggest that the rainfall-runoff models calibrated to measured streamflow data provide the best 7Q10 estimation in terms of all error metrics except median relative bias, but among all models applicable to ungaged locations, the statistical estimates from StreamStats display the lowest error metrics in every category.
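
The evaluation metrics named in the abstract can be sketched in a few lines of Python. The KGE function follows the standard Kling-Gupta formulation (correlation, variability ratio, and bias ratio). The four 7Q10 error metrics are written in forms that match the units quoted above (cfs, cfs/cfs, cfs/mi²); the abstract does not give their formulas, so these definitions, the function names, and the argument names are assumptions for illustration rather than the authors' implementation.

```python
import numpy as np


def kge(sim, obs):
    """Kling-Gupta Efficiency of simulated vs. observed daily streamflow."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


def low_flow_errors(est, obs, area_mi2):
    """Error metrics for 7Q10 estimates across a set of gages.

    est, obs  : estimated and gage-derived 7Q10 values (cfs), one per gage
    area_mi2  : drainage area of each gage (mi^2)
    The formulas below are assumed definitions chosen to match the units
    reported in the abstract; the paper may define them differently.
    """
    est = np.asarray(est, dtype=float)
    obs = np.asarray(obs, dtype=float)
    area = np.asarray(area_mi2, dtype=float)
    err = est - obs
    return {
        "median_relative_bias": float(np.median(err / obs)),       # cfs/cfs
        "rmse": float(np.sqrt(np.mean(err ** 2))),                  # cfs
        "rrmse": float(np.sqrt(np.mean((err / obs) ** 2))),         # cfs/cfs
        "ua_rmse": float(np.sqrt(np.mean((err / area) ** 2))),      # cfs/mi^2
    }


if __name__ == "__main__":
    # Made-up numbers, purely to show the call pattern.
    daily_obs = [12.0, 9.5, 7.1, 15.3, 30.2]
    daily_sim = [10.8, 9.9, 8.0, 13.7, 26.5]
    print("KGE:", kge(daily_sim, daily_obs))
    print(low_flow_errors(est=[2.0, 4.0], obs=[1.0, 4.0], area_mi2=[1.0, 2.0]))
```

A KGE of 1 indicates a perfect match, so the reported medians of 0.30, 0.48, and 0.52 can be read on that scale. Dividing the error by the observed 7Q10 or by drainage area keeps large basins from dominating the comparison, which is presumably why the study reports relative and unit-area variants alongside plain RMSE.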