Devise Sparse Compression Schedulers to Enhance FastText Methods

Chen-Ting Chao, Wei-Hsu Chu, Chao-Lin Lee, Jenq-Kuen Lee, Ming-Yu Hung, Hsiang-Wei Sung
DOI: 10.1145/3409390.3409394
Published in: Workshop Proceedings of the 49th International Conference on Parallel Processing, 2020-08-17
Citations: 1

Abstract

In natural language processing (NLP), the standard way to capture the meaning of a word is through word embedding. A word embedding training model converts words into multidimensional vectors, turning symbols without inherent "meaning" into vectors that encode meaning. Well-known word embedding models include FastText, Word2Vec, and GloVe; they train words into vectors that can then be used for further semantic classification. In this paper, we work on efficient support for FastText, an open-source library created by Facebook's FAIR lab that allows users to learn word embeddings and text classification. We focus on the word representation application in FastText, in which general matrix-vector multiplication (GEMV) is one of the most computationally intensive operations. We adjust the software architecture of FastText and pre-process the pre-trained model offline. In addition, we introduce a new acceleration method based on sparse matrix compression in Halide, which improves performance by compressing the matrix. Our Halide sparse compression schedulers include hybrid compression schemes and re-ordering methods to further improve performance.