{"title":"K-Means Algorithm Over Compressed Binary Data","authors":"Elsa Dupraz","doi":"10.1109/DCC.2018.00060","DOIUrl":"https://doi.org/10.1109/DCC.2018.00060","url":null,"abstract":"We consider a network of binary-valued sensors with a fusion center. The fusion center has to perform K-means clustering on the binary data transmitted by the sensors. In order to reduce the amount of data transmitted within the network, the sensors compress their data with a source coding scheme based on binary sparse matrices. We propose to apply the K-means algorithm directly over the compressed data without reconstructing the original sensor measurements, in order to avoid potentially complex decoding operations. We provide approximate expressions for the error probabilities of the K-means steps in the compressed domain. From these expressions, we show that applying the K-means algorithm in the compressed domain makes it possible to recover the clusters of the original domain. Monte Carlo simulations illustrate the accuracy of the approximate error probabilities, and show that the coding rate needed to perform K-means clustering in the compressed domain is lower than the rate needed to reconstruct all the measurements.","PeriodicalId":137206,"journal":{"name":"2018 Data Compression Conference","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125160408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal In-Place Suffix Sorting","authors":"Zhize Li, Jian Li, Hongwei Huo","doi":"10.1109/DCC.2018.00075","DOIUrl":"https://doi.org/10.1109/DCC.2018.00075","url":null,"abstract":"The suffix array is a fundamental data structure for many applications that involve string searching and data compression. We obtain the first linear-time in-place suffix array construction algorithm that is optimal in both time and space for read-only integer alphabets. Our algorithm settles the open problem posed by [Franceschini and Muthukrishnan, ICALP'07], which asked for in-place algorithms running in o(n log n) time and, ultimately, in O(n) time for integer alphabets with |Σ| ≤ n. Our result is in fact slightly stronger, since we allow |Σ| = O(n). In addition, we extend it to obtain an optimal O(n log n) time in-place suffix sorting algorithm for read-only general alphabets (i.e., only comparisons are allowed).","PeriodicalId":137206,"journal":{"name":"2018 Data Compression Conference","volume":"46 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120895597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}