A Compression Algorithm for Fluctuant Data in Smart Grid Database Systems

Chi-Cheng Chuang, Y. Chiu, Zhi-Hung Chen, Hao-Ping Kang, Che-Rung Lee

2013 Data Compression Conference, March 20, 2013. DOI: 10.1109/DCC.2013.67
In this paper, we present a lossless compression algorithm for fluctuant data that can be integrated into a database system and supports regular database insertions and queries. The algorithm is based on the observation that fluctuant data, although varying violently over small time intervals, exhibit similar patterns over time. The algorithm first partitions the records into segments of k consecutive records each. These segments are normalized and treated as vectors in k-dimensional space. Classification algorithms are then applied to find representative vectors for the normalized vectors. The classification criterion is that every normalized segment has at least one representative vector whose distance to it is less than a given threshold. These representative vectors, called codes, are stored in a codebook. The codebook can be generated offline from a small training dataset and reused repeatedly. The online compression algorithm searches for the nearest code to an input segment and stores only the ID of that code and the difference between them. Since the difference is small, it can be compressed with Rice coding or Golomb coding.
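The pipeline described in the abstract can be sketched as follows. This is a minimal illustration under several simplifying assumptions: samples are integer-valued, the normalization step is omitted (so the code-plus-residual reconstruction stays exactly lossless without storing scale factors), and a simple greedy pass stands in for the paper's unspecified classification algorithm. The names `build_codebook`, `compress_segment`, and the parameters `threshold` and `k` are illustrative, not from the paper.

```python
import math

def build_codebook(segments, threshold):
    # Greedy stand-in for the classification step: every segment must end up
    # within `threshold` (Euclidean distance) of at least one code, which is
    # the criterion stated in the abstract.
    codebook = []
    for seg in segments:
        if all(math.dist(seg, c) > threshold for c in codebook):
            codebook.append(list(seg))
    return codebook

def zigzag(v):
    # Map signed residuals to non-negative integers so Rice coding applies.
    return 2 * v if v >= 0 else -2 * v - 1

def rice_encode(value, k):
    # Rice code: unary quotient, '0' terminator, then a k-bit binary remainder.
    q, r = value >> k, value & ((1 << k) - 1)
    return "1" * q + "0" + format(r, "0{}b".format(k))

def compress_segment(segment, codebook, k=2):
    # Online step: nearest code by linear scan (an index structure such as a
    # k-d tree could replace this), then Rice-code the small residual.
    code_id = min(range(len(codebook)),
                  key=lambda i: math.dist(segment, codebook[i]))
    residual = [s - c for s, c in zip(segment, codebook[code_id])]
    bits = "".join(rice_encode(zigzag(r), k) for r in residual)
    return code_id, bits

def decompress_segment(code_id, bits, codebook, k=2):
    # Invert Rice coding and zigzag, then add the code back: lossless.
    vals, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == "1":
            q += 1
            i += 1
        i += 1  # skip the '0' terminator
        r = int(bits[i:i + k], 2)
        i += k
        v = (q << k) | r
        vals.append(v // 2 if v % 2 == 0 else -(v + 1) // 2)
    return [c + v for c, v in zip(codebook[code_id], vals)]
```

For example, training on three segments with a threshold of 5 yields a two-code codebook; a new segment close to the first code compresses to that code's ID plus a short Rice-coded bit string, and decompression reproduces the segment exactly.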