{"title":"基于软二值化训练的高效神经网络语言模型压缩","authors":"Rao Ma, Qi Liu, Kai Yu","doi":"10.1109/ASRU46091.2019.9003744","DOIUrl":null,"url":null,"abstract":"The long short-term memory language model (LSTM LM) has been widely investigated in large vocabulary continuous speech recognition (LVCSR) task. Despite the excellent performance of LSTM LM, its usage in resource-constrained environments, such as portable devices, is limited due to the high consumption of memory. Binarized language model has been proposed to achieve significant memory reduction at the cost of performance degradation at high compression ratio. In this paper, we propose a soft binarization approach to recover the performance of binarized LSTM LM. Experiments show that the proposed method can achieve a high compression rate of 30 × with almost no performance loss in both language modeling and speech recognition tasks.","PeriodicalId":150913,"journal":{"name":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","volume":"336 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Highly Efficient Neural Network Language Model Compression Using Soft Binarization Training\",\"authors\":\"Rao Ma, Qi Liu, Kai Yu\",\"doi\":\"10.1109/ASRU46091.2019.9003744\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The long short-term memory language model (LSTM LM) has been widely investigated in large vocabulary continuous speech recognition (LVCSR) task. Despite the excellent performance of LSTM LM, its usage in resource-constrained environments, such as portable devices, is limited due to the high consumption of memory. Binarized language model has been proposed to achieve significant memory reduction at the cost of performance degradation at high compression ratio. In this paper, we propose a soft binarization approach to recover the performance of binarized LSTM LM. Experiments show that the proposed method can achieve a high compression rate of 30 × with almost no performance loss in both language modeling and speech recognition tasks.\",\"PeriodicalId\":150913,\"journal\":{\"name\":\"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)\",\"volume\":\"336 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU46091.2019.9003744\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU46091.2019.9003744","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Highly Efficient Neural Network Language Model Compression Using Soft Binarization Training
The long short-term memory language model (LSTM LM) has been widely investigated for large vocabulary continuous speech recognition (LVCSR) tasks. Despite the excellent performance of the LSTM LM, its use in resource-constrained environments, such as portable devices, is limited by its high memory consumption. Binarized language models have been proposed to achieve significant memory reduction, but at the cost of performance degradation at high compression ratios. In this paper, we propose a soft binarization approach to recover the performance of the binarized LSTM LM. Experiments show that the proposed method achieves a high compression ratio of 30× with almost no performance loss in both language modeling and speech recognition tasks.
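The abstract does not spell out the exact training formulation, so the following is only a minimal sketch of the general idea: hard binarization stores each weight matrix as sign bits plus one scaling factor, while a "soft" differentiable surrogate (here assumed to be a tanh-based relaxation annealed toward the sign function) lets the model adapt before the hard quantization is applied. The function names, the per-matrix scaling, and the annealing schedule are illustrative assumptions, not the authors' method.

```python
# Sketch of hard vs. soft weight binarization for a single weight matrix.
# Assumptions (not from the paper): XNOR-style per-matrix scaling alpha,
# and a tanh(beta * w) surrogate whose sharpness beta is annealed upward
# during training so the soft weights converge to the hard binarized ones.
import torch


def hard_binarize(w: torch.Tensor) -> torch.Tensor:
    """Map each weight to {-alpha, +alpha}, with alpha = mean(|w|).

    Storing only 1 bit per weight plus one scalar per matrix is what
    yields the large (e.g. ~30x) memory reduction for an LSTM LM.
    """
    alpha = w.abs().mean()
    return alpha * torch.sign(w)


def soft_binarize(w: torch.Tensor, beta: float) -> torch.Tensor:
    """Differentiable surrogate: alpha * tanh(beta * w).

    For small beta this stays close to a scaled identity; as beta grows
    it approaches the hard sign function, so a model trained while beta
    is annealed degrades gracefully when hard binarization is applied.
    """
    alpha = w.abs().mean()
    return alpha * torch.tanh(beta * w)


if __name__ == "__main__":
    torch.manual_seed(0)
    w = torch.randn(256, 256)  # stand-in for one LSTM weight matrix

    hard = hard_binarize(w)
    for beta in (1.0, 5.0, 25.0):
        soft = soft_binarize(w, beta)
        gap = (soft - hard).abs().mean().item()
        print(f"beta={beta:5.1f}  mean |soft - hard| = {gap:.4f}")
```

In this sketch, the gap between the soft and hard weights shrinks as beta increases, which is the property any such relaxation needs so that switching to the fully binarized model at deployment time causes little additional loss.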