Tensor-matrix products with a compressed sparse tensor

Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms. Pub Date: 2015-11-15. DOI: 10.1145/2833179.2833183

Shaden Smith, G. Karypis


Citations: 111

Abstract

The Canonical Polyadic Decomposition (CPD) of tensors is a powerful tool for analyzing multi-way data and is used extensively to analyze very large and extremely sparse datasets. The bottleneck of computing the CPD is multiplying a sparse tensor by several dense matrices. Algorithms for tensor-matrix products fall into two classes. The first class saves floating-point operations by storing a compressed tensor for each dimension of the data. These methods are fast but incur high memory costs. The second class uses a single uncompressed tensor at the cost of additional floating-point operations. In this work, we bridge the gap between the two approaches and introduce the compressed sparse fiber (CSF), a data structure for sparse tensors, along with a novel parallel algorithm for tensor-matrix multiplication. CSF offers operation reductions similar to those of existing compressed methods while using only a single tensor structure. We validate our contributions with experiments comparing against state-of-the-art methods on a diverse set of datasets. Our work uses 58% less memory than the state of the art while achieving 81% of the parallel performance on 16 threads.
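To make the trade-off between the two classes concrete, the following is a minimal sketch for a three-way tensor, and it is not the authors' CSF implementation or parallel algorithm. It assumes nonzeros are given as `(i, j, k, value)` tuples and that the dense matrices `B` and `C` are lists of rows; the function names `mttkrp_coo`, `build_fiber_tree`, and `mttkrp_fibers` are hypothetical. The first function walks an uncompressed coordinate list, while the second walks a nested `i -> j -> k` grouping that stores each shared index prefix once and reuses the partial sum for every fiber.

```python
from collections import defaultdict


def mttkrp_coo(nonzeros, B, C, num_rows, rank):
    """Mode-1 tensor-matrix product over an uncompressed coordinate list."""
    M = [[0.0] * rank for _ in range(num_rows)]
    for i, j, k, val in nonzeros:
        for r in range(rank):
            # Two multiplications per nonzero per column.
            M[i][r] += val * B[j][r] * C[k][r]
    return M


def build_fiber_tree(nonzeros):
    """Group nonzeros as i -> j -> [(k, val)] so each (i, j) prefix is stored once.

    This nesting is only an illustration of fiber-style compression,
    not the storage layout described in the paper.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for i, j, k, val in nonzeros:
        tree[i][j].append((k, val))
    return tree


def mttkrp_fibers(tree, B, C, num_rows, rank):
    """Same product, but B[j] is applied once per (i, j) fiber."""
    M = [[0.0] * rank for _ in range(num_rows)]
    for i, fibers in tree.items():
        for j, leaves in fibers.items():
            # Accumulate the fiber's contribution from C first.
            acc = [0.0] * rank
            for k, val in leaves:
                for r in range(rank):
                    acc[r] += val * C[k][r]
            # One multiplication by B[j] per fiber instead of per nonzero.
            for r in range(rank):
                M[i][r] += B[j][r] * acc[r]
    return M


# Small hand-made example (values are illustrative, not from the paper).
nonzeros = [(0, 0, 0, 1.0), (0, 0, 1, 2.0), (1, 1, 0, 3.0), (1, 0, 1, 4.0)]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[1.0, 0.5], [2.0, 1.0]]

dense_result = mttkrp_coo(nonzeros, B, C, num_rows=2, rank=2)
fiber_result = mttkrp_fibers(build_fiber_tree(nonzeros), B, C, num_rows=2, rank=2)
```

In this sketch both traversals produce the same output matrix; the fiber version saves multiplications whenever several nonzeros share the same `(i, j)` prefix, which is the kind of operation reduction the abstract attributes to compressed formats.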
