Categorization of crowd-sensing streaming data for contextual characteristic detection

J. Smart Cities Soc. | Pub Date: 2023-08-17 | DOI: 10.3233/scs-230013

Philipp Kisters, Hanno Schreiber, Janick Edinger



Abstract

The growing reliance on large wireless sensor networks, potentially consisting of hundreds of nodes, to monitor real-world phenomena inevitably results in large, complex datasets that become increasingly difficult to process using traditional methods. Anomalies inadvertently included in the dataset, a consequence of the inherent characteristics of these networks, make it difficult to separate interesting events from erroneous measurements. At the same time, improvements in data science methods and wider access to powerful computers have made these techniques more applicable to everyday data mining problems. Beyond processing large amounts of complex streaming data, a wide array of specialized data science methods enables analyses that are not possible with traditional techniques. Using real-world streaming data gathered by a temperature sensor network of approximately 600 nodes, various data science methods were analyzed for their ability to exploit implicit dependencies embedded in unlabelled data in order to identify contextual characteristics. The methods identified during this analysis were incorporated into a software pipeline. The pipeline reduced the identification of characteristics in the dataset to a trivial task, and applying it led to the detection of various characteristics describing the context in which the sensors are deployed.
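To make the idea of inferring context from unlabelled readings concrete, here is a minimal sketch. It is not the authors' pipeline, whose methods the abstract does not specify. It assumes a single hand-picked feature (each sensor's temperature range) and a tiny one-dimensional two-means split. The sensor names, the readings, and the reading of the two groups as "exposed" versus "sheltered" placements are all hypothetical.

```python
from statistics import mean


def temperature_range(readings):
    """Return max - min of one sensor's temperature readings (degrees C)."""
    return max(readings) - min(readings)


def two_means_1d(values, iterations=20):
    """Split 1-D values into two groups with a tiny k-means (k = 2).

    Returns one label per value: 0 for the lower-centroid group,
    1 for the higher-centroid group.
    """
    low, high = min(values), max(values)  # initial centroids
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearest centroid (ties go to the lower one).
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if not group1:  # all values identical: nothing to split
            break
        low, high = mean(group0), mean(group1)
    return labels


# Hypothetical, hand-made readings; no labels are attached to any sensor.
streams = {
    "sensor_a": [12.0, 9.5, 15.0, 21.5, 18.0],
    "sensor_b": [11.0, 8.0, 14.5, 22.0, 17.0],
    "sensor_c": [20.5, 20.0, 21.0, 22.0, 21.5],
    "sensor_d": [19.0, 19.5, 20.0, 20.5, 20.0],
}

names = list(streams)
ranges = [temperature_range(streams[name]) for name in names]
groups = dict(zip(names, two_means_1d(ranges)))

for name in names:
    # Assumed interpretation: a large swing suggests an exposed placement.
    kind = "exposed (assumed)" if groups[name] == 1 else "sheltered (assumed)"
    print(f"{name}: range={ranges[names.index(name)]:.1f} C -> {kind}")
```

The grouping comes only from how the readings behave, which is the sense in which unlabelled data can carry implicit information about deployment context. A real pipeline on roughly 600 streaming nodes would need anomaly handling and richer features than this sketch shows.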

