Behavior Analysis through Routine Cluster Discovery in Ubiquitous Sensor Data

2020 International Conference on COMmunication Systems & NETworkS (COMSNETS) | Pub Date: 2020-01-01 | DOI: 10.1109/COMSNETS48256.2020.9027377

Manan Sharma, Shivam Tiwari, Suchetana Chakraborty, D. Banerjee

Citations: 1

Abstract

Behavior Analysis through Routine Cluster Discovery in Ubiquitous Sensor Data

Behavioral analysis (BA) of ubiquitous sensor data is the task of finding the latent distribution of features in order to model user-specific characteristics. These characteristics can in turn be used for tasks such as resource management, power efficiency, and smart home applications. In recent years, topic models have been found to successfully extract the dynamics of sensed data for BA. Topic modeling is most commonly applied to text data, where it mines latent topics in an unsupervised manner. In this work we propose a novel clustering technique for BA that finds hidden routines in ubiquitous data and also captures the patterns within those routines. Our approach works efficiently on high-dimensional data without performing any computationally expensive dimensionality reduction operations. For a comparative study we evaluate three techniques: Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF), and Probabilistic Latent Semantic Analysis (PLSA). We analyze the efficiency of the methods using performance indices such as perplexity and silhouette score on three real-world ubiquitous sensor datasets: the Intel Lab data, the Kyoto data, and the MERL data. In our experiments, clustering achieves silhouette scores of 0.7049 on the Intel Lab dataset, 0.6547 on the Kyoto dataset, and 0.8312 on the MERL dataset.
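To help interpret the silhouette scores quoted in the abstract, here is a minimal Python sketch of how a mean silhouette coefficient is usually computed. It is not the authors' implementation: the function name, the toy points, and the choice of Euclidean distance are all assumptions made for illustration, since the abstract does not state which distance was used. For each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the point's score is `(b - a) / max(a, b)`.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance assumed)."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to every member other than itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(  # separation: mean distance to the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data (made up): two tight, well-separated groups of two points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(round(silhouette_score(toy_points, [0, 0, 1, 1]), 4))  # good grouping, close to 1
print(round(silhouette_score(toy_points, [0, 1, 0, 1]), 4))  # poor grouping, negative
```

Scores lie between -1 and 1. Values near 1 mean points sit much closer to their own cluster than to any other, which is the sense in which a higher silhouette score indicates better-separated clusters.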
