Application of Sediment Fingerprinting to Apportion Sediment Sources: Using Machine Learning Models

Journal metrics: IF 1.2; Zone 4, Agricultural and Forestry Sciences; JCR Q3, Agricultural Engineering
Kritika Malhotra, Jingyi Zheng, Ash Abebe, Jasmeet Lamba
Journal of the ASABE, 2023. DOI: 10.13031/ja.14906 (https://doi.org/10.13031/ja.14906)

Abstract

Highlights
Relative source contributions to stream bed sediment from construction sites and stream banks were quantified.
Two machine-learning techniques were used to select composite fingerprinting properties.
The MixSIR Bayesian model was employed for source apportionment.
Statistical methods employed for fingerprinting property selection have the potential to impact source apportionments.
Management strategies to reduce sediment mobilization should be targeted depending on the dominant source of sediment in each sub-watershed.

Abstract. Sediment fingerprinting is an extensively used approach for investigating sediment sources by linking in-stream sediment mixtures with watershed source materials. The overall goal of this research was to estimate the relative source contributions of stream banks and construction sites to the stream bed sediment in an urbanized watershed (Alabama, USA) at a sub-watershed scale, using a fingerprinting technique based on composite fingerprints selected by two different machine learning techniques. The two statistical approaches employed to select the subset of fingerprinting properties were: (1) the Random Forest (RF) algorithm with Gini importance ranking of variables; and (2) logistic regression with the least absolute shrinkage and selection operator (LASSO). A Bayesian mixing model was then used to estimate the distribution of mixing proportions along with the associated uncertainty. The mixing models were built on the composite fingerprints selected using the two machine learning methods. Overall, using the subsets of fingerprints selected by RF and LASSO, the relative contribution of stream banks ranged from 14±9% to 97±2% and from 24±18% to 94±5%, respectively, throughout the watershed.
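As a rough illustration of the LASSO selection step, the sketch below fits an L1-penalized logistic regression by proximal gradient descent (ISTA) and keeps the tracers whose coefficients are not shrunk to zero. It is not the authors' implementation: the tracer names, the concentrations, the penalty `lam`, the step size, and the iteration count are all hypothetical, and the solver is a stand-in chosen only to keep the example free of dependencies.

```python
import math

# Hypothetical tracer concentrations; each row is one source sample.
# These numbers are invented for illustration and are not data from the study.
TRACERS = ["tracer_A", "tracer_B", "tracer_C"]
BANKS = [[10, 3, 5], [11, 4, 7], [12, 5, 7], [13, 6, 5]]
SITES = [[20, 4, 5], [21, 5, 7], [22, 6, 7], [23, 7, 5]]


def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic(X, y, lam=0.1, step=0.5, n_iter=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent (ISTA)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability of class 1 minus the 0/1 label.
        resid = []
        for row, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            resid.append(1.0 / (1.0 + math.exp(-z)) - yi)
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        grad_b = sum(resid) / n
        # Gradient step followed by soft-thresholding of the coefficients.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b  # the intercept is not penalized
    return w, b


X = standardize(BANKS + SITES)
y = [0] * len(BANKS) + [1] * len(SITES)  # 0 = stream bank, 1 = construction site
weights, intercept = lasso_logistic(X, y)
selected = [name for name, wj in zip(TRACERS, weights) if wj != 0.0]
print("coefficients:", {name: round(wj, 3) for name, wj in zip(TRACERS, weights)})
print("selected tracers:", selected)
```

Raising `lam` shrinks more coefficients to exactly zero and so yields a smaller composite fingerprint; in practice the penalty would usually be chosen by cross-validation.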
The stream bank contributions were compared with those from a previous study in the same watershed, which used a two-step statistical procedure (a Mann-Whitney U-test followed by discriminant function analysis (DFA)) to select the composite fingerprinting properties and a frequentist mixing model to calculate the source apportionments. The relative contributions of stream banks to stream bed sediment reported in the previous study ranged from 9±8% to 100±1%. Therefore, the study demonstrated the dependence of source attributions on the statistical procedures used to select the optimum composite fingerprints for sediment fingerprinting applications. Furthermore, the results underscored the importance of using different mixing model structures to obtain reliable estimates of source contributions.

Keywords: Least absolute shrinkage and selection operator (LASSO), MixSIR Bayesian model, Random Forest (RF), Statistical techniques.
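The Bayesian mixing step can be sketched in a similar spirit. The example below uses sampling-importance-resampling, the algorithm that MixSIR is named after, for a single stream bed sample and two sources: stream bank proportions are drawn from a uniform prior, weighted by a normal likelihood whose mean and variance combine the source tracer statistics, and resampled in proportion to those weights. All tracer means, standard deviations, and mixture values are hypothetical, and the likelihood form is an assumption made for illustration rather than a reproduction of the MixSIR software.

```python
import math
import random
import statistics

# Hypothetical (mean, sd) of two tracers in each source; not data from the study.
SOURCES = {
    "stream_banks": [(10.0, 1.0), (50.0, 3.0)],
    "construction_sites": [(20.0, 1.5), (30.0, 3.0)],
}
# Hypothetical tracer values measured in one stream bed sediment sample.
MIXTURE = [13.0, 44.0]


def log_likelihood(f_bank, mixture=MIXTURE, sources=SOURCES):
    """Log-likelihood of the mixture given the stream bank proportion f_bank."""
    total = 0.0
    banks = sources["stream_banks"]
    sites = sources["construction_sites"]
    for x, (mu_b, sd_b), (mu_c, sd_c) in zip(mixture, banks, sites):
        # Assumed model: mixture mean and variance are proportion-weighted
        # combinations of the source means and variances.
        mean = f_bank * mu_b + (1.0 - f_bank) * mu_c
        var = (f_bank * sd_b) ** 2 + ((1.0 - f_bank) * sd_c) ** 2
        total += -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)
    return total


def sir_posterior(n_draws=20000, seed=42):
    """Sampling-importance-resampling with a uniform prior on the bank proportion."""
    rng = random.Random(seed)
    proposals = [rng.random() for _ in range(n_draws)]  # draws from the prior
    log_w = [log_likelihood(f) for f in proposals]
    max_log_w = max(log_w)  # subtract the maximum to avoid numerical underflow
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    # Resample with replacement in proportion to the importance weights.
    return rng.choices(proposals, weights=weights, k=n_draws)


posterior = sir_posterior()
bank_mean = statistics.mean(posterior)
bank_sd = statistics.stdev(posterior)
print(f"stream banks: {100 * bank_mean:.0f} ± {100 * bank_sd:.0f}%")
print(f"construction sites: {100 * (1 - bank_mean):.0f} ± {100 * bank_sd:.0f}%")
```

The output has the same "mean ± spread" form as the contributions quoted in the abstract, with the spread taken here as the posterior standard deviation. Because the hypothetical mixture values were constructed as a 70/30 blend of the two source means, the posterior mean for stream banks should land close to 0.7.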