An open source cyberinfrastructure for collecting, processing, storing and accessing high temporal resolution residential water use data

Camilo J. Bastidas Pacheco, Joseph C. Brewer, J. Horsburgh, J. Caraballo
DOI: 10.5194/EGUSPHERE-EGU21-6031 (https://doi.org/10.5194/EGUSPHERE-EGU21-6031)
Journal: Environ. Model. Softw., volume 28 1, pages 105137
Publication date: 2021-03-03
Publication type: Journal Article
Citations: 7

Abstract

Collecting and managing high temporal resolution (< 1 minute) residential water use data is challenging because of the cost and technical requirements associated with the volume and velocity of the data collected. It is well known that this type of data has the potential to expand our knowledge of residential water use, inform future water use predictions, and improve water conservation strategies. However, most studies collecting this type of data have focused on the practical application of the data (e.g., developing and applying end use disaggregation algorithms), with much less attention to how the data were collected, retrieved, quality controlled, and managed to enable visualization and analysis. We developed an open-source, modular, generalized cyberinfrastructure system to automate the process from data collection to analysis. The system has three main architectural components: first, the sensors and dataloggers for water use monitoring; second, the data communication, parsing, and archival tools; and third, the analysis, visualization, and presentation of data produced for different audiences. For the first component, we present a low-cost datalogging device designed for installation on top of existing analog, magnetically driven, positive displacement residential water meters, which can collect data at a user-configurable temporal resolution. The second component is a system built on existing open-source software technologies that manages the data collected, including services and databases. The final component includes software tools for retrieving the data, which can be integrated with advanced data analytics tools. The system was used in a single-family residential water use data collection case study to test the scalability and performance of its functionality within our design constraints.
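To make the idea of a user-configurable temporal resolution concrete, the following is a minimal sketch and not the authors' implementation. It assumes, hypothetically, that the datalogger registers discrete pulses from the meter and that each pulse corresponds to a fixed volume; the abstract does not state how the device converts meter motion into records. The function name `bin_pulses`, the pulse timestamps, and the volume-per-pulse value are all illustrative assumptions.

```python
from collections import Counter


def bin_pulses(pulse_times_s, interval_s, volume_per_pulse):
    """Aggregate pulse timestamps (seconds) into fixed-width intervals.

    Returns a dict mapping each interval start time (seconds) to the
    volume recorded in that interval. Intervals with no pulses are omitted.
    """
    # Map every pulse to the start of the interval that contains it.
    counts = Counter(int(t // interval_s) * interval_s for t in pulse_times_s)
    # Convert pulse counts to volume using the assumed per-pulse volume.
    return {start: n * volume_per_pulse for start, n in sorted(counts.items())}


# Hypothetical pulse timestamps, in seconds since the start of logging.
pulses = [0.5, 1.2, 3.9, 4.1, 9.0, 11.5]

# interval_s plays the role of the user-configurable resolution (4 s here);
# 0.01 is an assumed volume per pulse in arbitrary units.
series = bin_pulses(pulses, interval_s=4, volume_per_pulse=0.01)

# Three pulses fall in [0, 4), one in [4, 8), and two in [8, 12).
for start, volume in series.items():
    print(start, volume)
```

Changing `interval_s` is the only step needed to produce records at a coarser or finer resolution from the same pulse stream, which is the property the sketch is meant to illustrate.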
Testing with a base system configuration, our results show that the system requires approximately six minutes to process a single day of data collected at a four-second temporal resolution for 500 properties. Thus, the system proved effective beyond the typical number of participants observed in similar studies of residential water use, and it would scale well beyond this even with the modest system resources we used for testing. All elements of the cyberinfrastructure are freely available in open-source repositories for reuse.
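The reported performance figure can be turned into an approximate record count and processing rate. The sketch below is a back-of-the-envelope calculation, not a benchmark. It assumes one record per property per four-second interval with no gaps, and it treats "approximately six minutes" as exactly 360 seconds. The function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def records_per_day(properties: int, resolution_s: int) -> int:
    """Records produced in one day, assuming one record per interval per property."""
    return properties * (SECONDS_PER_DAY // resolution_s)


def throughput(records: int, processing_s: float) -> float:
    """Average number of records processed per second."""
    return records / processing_s


# Figures from the abstract: 500 properties, 4 s resolution, about 6 minutes.
total = records_per_day(properties=500, resolution_s=4)
rate = throughput(total, processing_s=6 * 60)

print(total)  # 10800000 records for one day
print(rate)   # 30000.0 records per second
```

Under these assumptions, one day of data is about 10.8 million records, processed at roughly 30,000 records per second.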
