ARCHI: A New R Package for Automated Imputation of Regionally Correlated Hydrologic Records.

Ground water | Pub Date: 2025-02-28 | DOI: 10.1111/gwat.13474
Zeno F Levy, Robin L Glas, Timothy J Stagnitta, Neil Terry

Abstract

Missing data in hydrological records can limit resource assessment, process understanding, and predictive modeling. Here, we present ARCHI (Automated Regional Correlation Analysis for Hydrologic Record Imputation), a new, open-source software package in R designed to aggregate, impute, cluster, and visualize regionally correlated hydrologic records. ARCHI imputes missing data in "target" records by linear regression using more complete "reference" records as predictors. Automated imputation is implemented using a novel, iterative algorithm that allows each site to be considered a target or reference for regression, growing the pool of complete references with each imputed record until viable gap-filling ceases. Users can limit artifacts from spurious correlations by specifying model-acceptance criteria and applying geospatial, correlation, and group-based filters to control reference selection. ARCHI provides additional functions for visualizing results, clustering records with similar correlation structures, evaluating holdout data, and interactive parameterization with an accessible and intuitive graphical user interface (GUI). This methods brief provides an overview of the ARCHI package, modeling guidelines, and benchmarking on two regional groundwater-level datasets from the Central Valley, CA and Long Island, NY. We evaluate ARCHI alongside widely used multivariate imputation software to highlight and contextualize its computational efficiency, imputation accuracy, and model transparency when applied to large, groundwater-level datasets.
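
The iterative scheme described above can be illustrated with a minimal sketch. This is not ARCHI's code or API (ARCHI is an R package, and none of its function names appear in this abstract). It is a simplified Python illustration under stated assumptions: each target is regressed on a single complete reference, the only acceptance criterion is a minimum R² (`min_r2`), and a minimum number of observed values (`min_overlap`) is required. The function name, parameters, and thresholds are hypothetical, and the geospatial, correlation, and group-based filters are omitted.

```python
import numpy as np

def iterative_regression_impute(records, min_r2=0.8, min_overlap=5):
    """Illustrative sketch only; not the ARCHI implementation.

    records: dict mapping site name -> 1-D array on a shared time axis
             (NaN marks a missing value).
    Returns (filled, log), where log lists (target, reference, r2).
    """
    filled = {k: np.asarray(v, dtype=float).copy() for k, v in records.items()}
    log = []
    progress = True
    while progress:  # stop once a full pass fills nothing
        progress = False
        # The reference pool is every record that is complete right now,
        # including records imputed on earlier passes.
        refs = [k for k, v in filled.items() if not np.isnan(v).any()]
        for target, y in filled.items():
            gaps = np.isnan(y)
            observed = ~gaps
            if not gaps.any() or observed.sum() < min_overlap:
                continue
            y_obs = y[observed]
            ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
            if ss_tot == 0:
                continue  # constant target: R^2 is undefined
            best = None
            for ref in refs:
                x_obs = filled[ref][observed]
                if x_obs.max() == x_obs.min():
                    continue  # constant predictor: no slope to fit
                slope, intercept = np.polyfit(x_obs, y_obs, 1)
                resid = y_obs - (slope * x_obs + intercept)
                r2 = 1.0 - np.sum(resid ** 2) / ss_tot
                # Hypothetical acceptance criterion: keep the best model
                # whose R^2 meets the threshold.
                if r2 >= min_r2 and (best is None or r2 > best[0]):
                    best = (r2, ref, slope, intercept)
            if best is not None:
                r2, ref, slope, intercept = best
                y[gaps] = slope * filled[ref][gaps] + intercept
                log.append((target, ref, round(float(r2), 3)))
                progress = True
    return filled, log

# Usage: site B tracks site A and is gap-filled; site C does not, so its
# gap is left unfilled rather than imputed from a spurious model.
a = np.arange(1.0, 9.0)
b = 2 * a + 1
b[[6, 7]] = np.nan
c = np.array([3.0, -1.0, 4.0, -1.0, 5.0, -9.0, np.nan, 6.0])
filled, log = iterative_regression_impute({"A": a, "B": b, "C": c})
```

In this sketch a record becomes a candidate reference only after it is fully gap-filled, which is how the reference pool grows from one pass to the next. The log of accepted target-reference pairs is one simple way to keep the imputation traceable.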
