Exploring spatial and temporal importance of input features and the explainability of machine learning-based modelling of water distribution systems

Impact factor: 4.1 | JCR: Q2 (Engineering, Chemical)

Digital Chemical Engineering | Pub Date: 2024-11-27 | DOI: 10.1016/j.dche.2024.100202

Ammar Riyadh, Nicolas M. Peleato

Digital Chemical Engineering, volume 14, Article 100202. Full text: https://www.sciencedirect.com/science/article/pii/S2772508124000644

Citations: 0

Abstract

Ensuring safe drinking water requires advanced management and monitoring of water quality in distribution systems. This study uses machine learning (ML) to model chlorine decay in a water distribution system (WDS) in British Columbia, Canada. A four-layer long short-term memory (LSTM) network was trained to predict chlorine concentrations at a reservoir >24,000 m from the treatment plant. Explainable AI (XAI) techniques were applied to the trained network to improve the transparency and reliability of the ML model. Several XAI methods were used to investigate the importance of sensor placement, identify the most significant features, understand feature ranges that result in poor performance, and validate model logic. Results demonstrated that for ML-based WDS control, sensor location is not critical: high prediction accuracy was achieved (mean absolute error <0.025 mg/L) even when exclusively using data from nodes spatially distant from the prediction site. The XAI techniques identified essential features and showed that the behaviour of the ML model conformed to expected chlorine behaviour. Superfluous variables were ranked low in importance, and the model learned fundamental aspects of chemical kinetics, such as temperature dependence and decay rate. Most importantly, the XAI methods were able to communicate the reasoning behind specific predictions, even at a local or sample-specific level. This study underscores the importance of transparency and trust in ML models, especially as the field transitions towards digital twin and Internet of Things (IoT) technologies, to enhance the effective management of water quality systems.
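The abstract does not name the specific XAI methods used, so the following is only an illustrative sketch of one common technique, permutation feature importance, scored with mean absolute error. Everything in it is an assumption made for illustration: the `predict` function is a hand-written first-order decay formula standing in for the trained LSTM, the rate constant, temperature factor, and travel time are hypothetical, and the feature names and synthetic data are invented. It shows how a feature the model ignores receives zero importance while physically meaningful inputs do not.

```python
import math
import random


def predict(row):
    """Stand-in for a trained model (NOT the paper's LSTM).

    Applies first-order chlorine decay with a temperature-adjusted rate.
    All constants below are hypothetical values chosen for illustration.
    """
    upstream_cl, temp_c, _unrelated = row
    k = 0.05 * 1.1 ** (temp_c - 20.0)          # hypothetical decay rate (1/h)
    return upstream_cl * math.exp(-k * 10.0)   # hypothetical 10 h travel time


def mae(rows, targets):
    """Mean absolute error of the stand-in model on the given rows."""
    return sum(abs(predict(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, names, seed=0):
    """Increase in MAE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mae(rows, targets)
    scores = {}
    for j, name in enumerate(names):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with only feature j replaced by a shuffled value.
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        scores[name] = mae(shuffled, targets) - baseline
    return scores


# Synthetic data: upstream chlorine (mg/L), water temperature (deg C),
# and a feature with no physical link to the target.
data_rng = random.Random(42)
names = ["upstream_cl", "temp_c", "unrelated"]
rows = [
    (data_rng.uniform(0.5, 1.5), data_rng.uniform(5.0, 25.0), data_rng.uniform(0.0, 1.0))
    for _ in range(500)
]
targets = [predict(r) for r in rows]

scores = permutation_importance(rows, targets, names)
for name in names:
    print(f"{name}: MAE increase = {scores[name]:.4f} mg/L")
```

Because the stand-in model never reads the third feature, shuffling it leaves the error unchanged, which mirrors the reported finding that superfluous variables rank low in importance. With a real trained network the same loop would wrap the network's prediction call instead of `predict`.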




Source journal

Digital Chemical Engineering

CiteScore

3.10

Self-citation rate

0.00%

Number of publications