Madalitso Mng’ombe, B. Chunga, Eddie W. Mtonga, R. Chidya, M. Malota
H2Open Journal, published 2023-06-02. DOI: 10.2166/h2oj.2023.013 (https://doi.org/10.2166/h2oj.2023.013)
Infilling missing data and outliers for a conventional sewage treatment plant using a self-organizing map: a case study of Kauma Sewage Treatment Plant in Lilongwe, Malawi
Data availability is key to modeling wastewater treatment processes. However, process data are characterized by missing values and outliers. This study applied a self-organizing map (SOM) to fill in missing values and replace outliers in wastewater treatment data from Kauma Sewage Treatment Plant in Lilongwe, Malawi. We used primary and secondary wastewater data and ran the SOM algorithm to fill missing values and replace outliers in effluent pH, biochemical oxygen demand (BOD), and dissolved oxygen (DO). The results suggest that the SOM algorithm reliably fills gaps in wastewater time series data with less than 50% missing values, with correlation coefficient (R) values of >0.90. The SOM algorithm failed to reliably fill gaps and replace outliers in time series data with more than 50% missing values. For instance, high mean square error (MSE) values of 3,655.57, 10.62, and 2,153.34 for pH, DO, and BOD, respectively, were registered in datasets with more than 50% missing values, while very small MSE values (MSE ≈ 0) were associated with effluent pH, BOD, and DO data with less than 50% missing values. Practitioners can use this approach to improve the planning and management of wastewater treatment facilities where available data records are riddled with missing observations.
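The abstract does not describe how the SOM was configured, so the following is only a minimal sketch of one common way to infill gaps with a SOM: train on standardized data while ignoring missing components, then copy the missing components from each record's best-matching unit. It is not the authors' implementation. The grid size, learning-rate and neighbourhood schedules, function names, and the synthetic pH/BOD/DO series are all assumptions made for illustration; no plant data are used.

```python
import numpy as np


def train_som(z, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on standardized data z; NaN marks a missing entry.

    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = z.shape
    mask = ~np.isnan(z)                 # True where a value was observed
    x = np.where(mask, z, 0.0)          # placeholder zeros, always masked out
    weights = 0.1 * rng.standard_normal((rows * cols, d))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    total, step = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5     # shrinking neighbourhood
            # Distance uses observed components only.
            diff = (x[i] - weights) * mask[i]
            bmu = np.argmin((diff ** 2).sum(axis=1))
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
            # Move units toward the sample, on observed components only.
            weights += lr * h[:, None] * diff
            step += 1
    return weights


def som_impute(data, **som_kwargs):
    """Fill NaN entries of data from each row's best-matching SOM unit."""
    mu, sd = np.nanmean(data, axis=0), np.nanstd(data, axis=0)
    z = (data - mu) / sd                # standardize so no variable dominates
    weights = train_som(z, **som_kwargs)
    mask = ~np.isnan(z)
    x = np.where(mask, z, 0.0)
    z_filled = z.copy()
    for i in range(len(z)):
        diff = (x[i] - weights) * mask[i]
        bmu = np.argmin((diff ** 2).sum(axis=1))
        z_filled[i, ~mask[i]] = weights[bmu, ~mask[i]]
    # Observed values are returned untouched; only gaps are replaced.
    return np.where(mask, data, z_filled * sd + mu)


# Synthetic, correlated stand-ins for effluent pH, BOD (mg/L) and DO (mg/L).
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1.0, 200)
truth = np.column_stack([
    7.0 + 0.5 * t + 0.05 * rng.standard_normal(200),
    40.0 + 60.0 * t + 3.0 * rng.standard_normal(200),
    6.0 - 3.0 * t + 0.2 * rng.standard_normal(200),
])

# Knock out roughly 20% of entries at random to simulate gaps.
held = rng.random(truth.shape) < 0.2
gappy = np.where(held, np.nan, truth)

filled = som_impute(gappy)

# Score the infilled values against the held-out truth, per variable.
for j, name in enumerate(["pH", "BOD", "DO"]):
    a, b = filled[held[:, j], j], truth[held[:, j], j]
    mse = np.mean((a - b) ** 2)
    r = np.corrcoef(a, b)[0, 1]
    print(f"{name}: MSE={mse:.3f}, R={r:.3f}")
```

Raising the knock-out fraction (for example from 0.2 to 0.6) is one way to explore how infilling quality changes as gaps grow, which is the comparison the abstract reports. MSE is computed in each variable's own units, so values are not comparable across pH, BOD, and DO.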