Leveraging multi-level correlations for imputing monitoring data in water supply systems using graph signal sampling theory

IF 7.2 | CAS Zone 2: Environmental Science and Ecology | JCR Q1: ENGINEERING, ENVIRONMENTAL
Xiao Zhou, Yacan Man, Shuming Liu, Juan Zhang, Rui Yuan, Wei Wang, Kuizu Su
Water Research X, volume 25, Article 100274. Published 2024-11-08. DOI: 10.1016/j.wroa.2024.100274
https://www.sciencedirect.com/science/article/pii/S2589914724000641
Citations: 0

Abstract

Missing and anomalous data from monitoring equipment have become critical barriers to developing intelligent Water Supply Systems (WSS). The valid data before and after a missing segment can be used to impute the missing values. However, traditional imputation methods, such as linear interpolation and prediction-based methods, either have limited capacity to exploit relationships in the data or can only use information that precedes the missing values. Existing methods therefore still struggle to achieve high-accuracy imputation efficiently and conveniently. Owing to the continuity and periodicity of WSS data, missing values often exhibit multi-level correlations with valid data. This paper employs graph structures to analyze the multi-level correlations at different timestamps and applies graph signal sampling algorithms to extract low-frequency features for imputation. A novel Graph-based Data Imputation (GDI) method has been developed, which leverages multi-level correlations to propagate information and completes imputation tasks without complex feature engineering or pre-training. Results indicate that GDI outperforms Holt-Winters, Support Vector Regression, and Gated Recurrent Unit models when imputing continuous missing data, and it still achieves R² > 0.8 even when the proportion of missing values reaches 80%. These results demonstrate that GDI provides a more streamlined and efficient imputation with high robustness and accuracy.
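
The abstract does not spell out the GDI algorithm, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. It assumes a graph whose nodes are timestamps, with edges to the neighbouring timestamp (continuity) and to the timestamp one period away (periodicity), and it substitutes a standard result from graph signal sampling theory: a signal spanned by the k lowest-frequency Laplacian eigenvectors can be recovered from a subset of its samples by least squares. The function names, the edge weights, `period`, and `k` are all hypothetical choices made for illustration.

```python
import numpy as np

def build_time_graph(n, period, w_adjacent=1.0, w_periodic=1.0):
    """Adjacency matrix over n timestamps (hypothetical construction).

    Each timestamp is linked to the next one (continuity) and to the
    one `period` steps later (periodicity).
    """
    A = np.zeros((n, n))
    for i in range(n):
        for lag, w in ((1, w_adjacent), (period, w_periodic)):
            j = i + lag
            if j < n:
                A[i, j] = A[j, i] = w
    return A

def impute_bandlimited(x, observed, A, k):
    """Fill missing entries of x assuming a k-bandlimited graph signal.

    x        : 1-D float array; values at missing positions are ignored
    observed : boolean mask, True where x holds a valid measurement
    A        : symmetric adjacency matrix of the timestamp graph
    k        : number of low-frequency Laplacian eigenvectors to keep
    """
    L = np.diag(A.sum(axis=1)) - A        # combinatorial graph Laplacian
    _, U = np.linalg.eigh(L)              # eigenvalues in ascending order
    Uk = U[:, :k]                         # low-frequency basis
    # Least-squares fit of the low-frequency coefficients to the valid samples.
    coeffs, *_ = np.linalg.lstsq(Uk[observed], x[observed], rcond=None)
    x_hat = x.copy()
    x_hat[~observed] = (Uk @ coeffs)[~observed]   # keep valid data untouched
    return x_hat
```

In this sketch, valid data on both sides of a gap contribute to the fit, which is the property the abstract contrasts with prediction-based methods that only use earlier data. Real monitoring data are only approximately low-frequency, so `k` trades smoothness against fidelity and would need tuning; how GDI itself builds the graph and selects samples is not stated in the abstract.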

Source journal: Water Research X
Category: Environmental Science-Water Science and Technology
CiteScore: 12.30
Self-citation rate: 1.30%
Articles published: 19
Journal description: Water Research X is a sister journal of Water Research, which follows a Gold Open Access model. It focuses on publishing concise, letter-style research papers, visionary perspectives and editorials, as well as mini-reviews on emerging topics. The Journal invites contributions from researchers worldwide on various aspects of the science and technology related to the human impact on the water cycle, water quality, and its global management.