Compression-Aware Algorithms for Massive Datasets

Nathan Brunelle, G. Robins, Abhi Shelat
Published in: 2015 Data Compression Conference, 2015-04-07
DOI: 10.1109/DCC.2015.74 (https://doi.org/10.1109/DCC.2015.74)
Citations: 3

Abstract

While massive datasets are often stored in compressed format, most algorithms are designed to operate on uncompressed data. We address this growing disconnect by developing a framework for compression-aware algorithms that operate directly on compressed datasets. Synergistically, we also propose new algorithmically-aware compression schemes that enable algorithms to efficiently process the compressed data. In particular, we apply this general methodology to geometric / CAD datasets that are ubiquitous in areas such as graphics, VLSI, and geographic information systems. We develop example algorithms and corresponding compression schemes that address different types of datasets, including point sets and graphs. Our methods are more efficient than their classical counterparts, and they extend to both lossless and lossy compression scenarios. This motivates further investigation of how this approach can enable algorithms to process ever-increasing big data volumes.
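As a rough illustration of what "operating directly on compressed data" can mean, the sketch below answers two queries over a run-length encoded sequence without expanding it. This is not the framework or the compression schemes from the paper, whose details the abstract does not give; run-length encoding is swapped in only because it is simple, and the function names (`rle_encode`, `rle_sum`, `rle_count_at_least`) are hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, Tuple

# Illustrative assumption: a "compressed dataset" is a list of
# (value, repeat_count) runs. The paper's actual schemes are not shown here.
Run = Tuple[int, int]


def rle_encode(values: Iterable[int]) -> List[Run]:
    """Compress a sequence into (value, count) runs of equal neighbours."""
    return [(v, sum(1 for _ in group)) for v, group in groupby(values)]


def rle_sum(runs: List[Run]) -> int:
    """Sum of the original sequence, visiting each run once."""
    return sum(v * count for v, count in runs)


def rle_count_at_least(runs: List[Run], threshold: int) -> int:
    """Number of original elements >= threshold, without expanding any run."""
    return sum(count for v, count in runs if v >= threshold)


data = [7] * 1000 + [2] * 500 + [9] * 250
runs = rle_encode(data)

# Both queries cost time proportional to the number of runs (3 here),
# not to the length of the uncompressed sequence (1750 here).
total = rle_sum(runs)
large = rle_count_at_least(runs, 7)
```

The benefit depends entirely on how well the data compresses: on a sequence with no repeated neighbours the run list is as long as the input, so the sketch gains nothing over the classical approach.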