Pei-Lun Suei, Che-Wei Kuo, R. Luoh, Tei-Wei Kuo, C. Shih, Min-Siong Liang
2010 IEEE 15th Conference on Emerging Technologies & Factory Automation (ETFA 2010), published 2010-11-18
DOI: 10.1109/ETFA.2010.5641312 (https://doi.org/10.1109/ETFA.2010.5641312)
Data compression and query for large scale sensor data on COTS DBMS
Multi-dimensional temporal data sets are the common format for storing sampled temporal data in sensor network applications. Over time, the core tables in such a data set may grow so large that they become unmanageable. Reducing storage space while still allowing on-line queries raises a challenging issue: how to trade off data compression effectiveness against on-line query performance. In this paper, we are concerned with an effective framework for temporal data sets that does not sacrifice on-line query performance and is specifically designed for very large sensor network databases. The sampled data are compressed using several candidate approaches, including dictionary-based compression and lossless vector quantization. Meanwhile, on-line queries are conducted without decompressing the compressed data set so as to enhance query performance. Experiments are conducted on a power meter database and the Sonoma database to evaluate the proposed methodologies in terms of data compression rate and data query speed. The results show that the compression rate ranges from 70% for numerical data to 20% for character data, while the added overhead for on-line queries is at most 2%.
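To make the idea of querying without decompression concrete, the following is a minimal Python sketch of generic dictionary encoding, in which an equality predicate is translated into a code once and then evaluated against the stored codes. It is an illustration only and not the scheme from the paper: the class name `DictionaryColumn`, its methods, and the sample status values are all hypothetical, and how the authors map this onto a COTS DBMS is not described in the abstract.

```python
from typing import Dict, List


class DictionaryColumn:
    """Hypothetical dictionary-encoded column.

    Each distinct value is stored once; rows hold small integer codes.
    """

    def __init__(self) -> None:
        self.code_of: Dict[str, int] = {}  # value -> code
        self.value_of: List[str] = []      # code -> value
        self.codes: List[int] = []         # one code per row

    def append(self, value: str) -> None:
        """Encode a value, extending the dictionary on first sight."""
        code = self.code_of.get(value)
        if code is None:
            code = len(self.value_of)
            self.code_of[value] = code
            self.value_of.append(value)
        self.codes.append(code)

    def rows_equal_to(self, value: str) -> List[int]:
        """Answer `column = value` by comparing codes; no row is decoded."""
        code = self.code_of.get(value)
        if code is None:
            # The value never occurs, so no row can match.
            return []
        return [i for i, c in enumerate(self.codes) if c == code]

    def decode(self, row: int) -> str:
        """Recover the original value of a single row (lossless)."""
        return self.value_of[self.codes[row]]


# Made-up sample readings, used only to exercise the sketch.
status = DictionaryColumn()
for reading in ["OK", "OK", "FAULT", "OK", "FAULT", "OK"]:
    status.append(reading)

fault_rows = status.rows_equal_to("FAULT")
```

The predicate costs one dictionary lookup plus integer comparisons over the codes, which is the general reason a scheme of this kind can keep query overhead small. Range predicates would need an order-preserving dictionary, which this sketch does not attempt.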