Machine learning-assisted collection of reduced sensor data for improved analytics pipeline
Ankur Verma, Ayush Goyal, Soundar Kumara
Procedia CIRP, published 2024-01-01
DOI: 10.1016/j.procir.2023.09.242
https://www.sciencedirect.com/science/article/pii/S2212827123009617
Citation count: 0
Abstract
Sensor data is increasingly offering better operational visibility. However, the data deluge also imposes cost and complexity challenges on the data analytics pipeline, which comprises edge computing, power, transmission, and storage for data-driven decision making. To address the data deluge problem, we propose a machine learning-assisted approach that collects less data upfront to solve different sensor data analytics problems. While sampling at Nyquist rates, we do not collect every data point, but rather sample according to the information content in the signal. A comprehensive experimental design is undertaken to show that collecting more than a certain fraction of the raw data leads to only infinitesimal performance improvements. The engineering advantages of the proposed near real-time approach are quantified, showing a significant reduction in the analytics pipeline resources required for industrial digital transformation applications.
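
The abstract does not say how samples are selected, so the following is only an illustrative sketch of collecting fewer points where a signal carries less information. It is not the authors' machine learning approach. It assumes a simple send-on-delta rule (keep a sample only when the value has moved by more than a threshold since the last kept sample) as a stand-in for an information-content criterion. The synthetic signal and the names `make_signal`, `select_samples`, `reconstruct`, and `rms_error` are all hypothetical.

```python
import math


def make_signal(n=1000):
    """Synthetic sensor trace (assumed): slow oscillation plus a short fast burst."""
    signal = []
    for i in range(n):
        t = i / n
        value = math.sin(2 * math.pi * 2 * t)  # slow, low-information component
        if 0.4 <= t < 0.5:
            value += 0.5 * math.sin(2 * math.pi * 60 * t)  # information-rich burst
        signal.append(value)
    return signal


def select_samples(signal, delta):
    """Return indices kept by a send-on-delta rule (a stand-in selection criterion)."""
    kept = [0]
    for i in range(1, len(signal)):
        # Keep the point only if it moved more than delta since the last kept point.
        if abs(signal[i] - signal[kept[-1]]) > delta:
            kept.append(i)
    if kept[-1] != len(signal) - 1:
        kept.append(len(signal) - 1)  # keep the endpoint so interpolation spans the trace
    return kept


def reconstruct(signal, kept):
    """Rebuild the full trace by linear interpolation between kept samples."""
    out = [signal[kept[0]]] * len(signal)
    for a, b in zip(kept, kept[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            out[i] = (1 - w) * signal[a] + w * signal[b]
    return out


def rms_error(x, y):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    sig = make_signal()
    for delta in (0.2, 0.1, 0.05, 0.02):
        kept = select_samples(sig, delta)
        err = rms_error(sig, reconstruct(sig, kept))
        print(f"delta={delta:<5} kept={len(kept) / len(sig):.1%} rms_error={err:.4f}")
```

Running the loop prints the retained fraction and the reconstruction error for several thresholds, which gives a rough feel for the trade-off the abstract describes between the fraction of data collected and the resulting performance. A task-specific metric, such as classifier accuracy, would be needed to reproduce the kind of evaluation the paper reports.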