Title: Predicting the presence of E. coli in tap water using machine learning in Nepal
Authors: So Kuroki, Ryuji Ogata, M. Sakamoto
Journal: Water and Environment Journal, volume 37 1, pages 402-411
Publication date: 2023-01-16
Publication type: Journal Article
DOI: 10.1111/wej.12844 (https://doi.org/10.1111/wej.12844)
Impact factor: 1.7; JCR: Q4 (Environmental Sciences)
Citation count: 0
Abstract
In developing countries, many problems in the water supply process can lead to contaminated water taps. Machine learning has become a popular way to predict water quality efficiently, but acquiring the data needed for modelling is challenging in developing countries. This study builds machine learning models for water quality prediction and uses a pseudo‐pipeline network to compensate for missing data on the water supply process. Using water source and water tap quality information measured by the Government of Nepal, we apply three machine learning models: support vector machine (SVM), random forest (RF) and LightGBM. We also apply a traditional statistical method, logistic regression (LR), to predict Escherichia coli (E. coli) contamination in water taps. With some input variables (such as the length from the nearest sources) obtained from the pseudo‐pipeline network, the results show that SVM has stable and high accuracy both for all 26 cities (70%) and for the 25 cities excluding Kathmandu (79%). LR achieved significantly lower accuracy for all 26 cities (61%) than for the 25 cities excluding Kathmandu (79%). Additionally, we show that our method can be applied to other regions where a water quality survey has not yet been conducted.
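The abstract names "the length from the nearest sources" as an input variable taken from the pseudo‐pipeline network but does not say how it is computed. As a rough illustration only, the sketch below assumes the network is an undirected graph of pipe segments with known lengths and runs a multi-source shortest-path search (Dijkstra) from every source at once. The function name, node labels and toy lengths are hypothetical and are not taken from the study.

```python
import heapq


def length_to_nearest_source(edges, sources):
    """Shortest network length from each reachable node to its nearest source.

    Assumption: the pseudo-pipeline network is an undirected weighted graph.
    edges   -- iterable of (node_a, node_b, length) pipe segments
    sources -- iterable of node ids that are water sources
    """
    # Build an adjacency list for the undirected graph.
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))

    # Multi-source Dijkstra: every source starts at distance 0.
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbour, length in graph.get(node, []):
            candidate = d + length
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Hypothetical toy network (lengths in metres), not data from the study.
edges = [
    ("source_A", "junction_1", 120.0),
    ("junction_1", "tap_1", 80.0),
    ("junction_1", "tap_2", 200.0),
    ("source_B", "tap_2", 50.0),
]
lengths = length_to_nearest_source(edges, ["source_A", "source_B"])
print(lengths["tap_1"], lengths["tap_2"])  # 200.0 50.0
```

Under these assumptions, the value computed for each tap could serve as one feature column alongside the measured source and tap quality variables when training a classifier such as SVM or LR.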
About the journal:
Water and Environment Journal is an internationally recognised peer-reviewed journal for the dissemination of innovations and solutions focussed on enhancing water management best practice. Water and Environment Journal is available to over 12,000 institutions, with a further 7,000 copies physically distributed to the Chartered Institution of Water and Environmental Management (CIWEM) membership, which is made up of environment sector professionals based across the value chain (utilities, consultancy, technology suppliers, regulators, government and NGOs). As such, the journal provides a conduit between academics and practitioners. We therefore particularly encourage contributions focussed on the interface between academia and industry, which deliver industrially impactful applied research underpinned by scientific evidence. We are keen to attract papers on a broad range of subjects including:
-Water and wastewater treatment for agricultural, municipal and industrial applications
-Sludge treatment including processing, storage and management
-Water recycling
-Urban and stormwater management
-Integrated water management strategies
-Water infrastructure and distribution
-Climate change mitigation including management of impacts on agriculture, urban areas and infrastructure