微软语音研究中深度学习的最新进展

2013 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date : 2013-05-26 DOI:10.1109/ICASSP.2013.6639345

L. Deng, Jinyu Li, J. Huang, K. Yao, Dong Yu, F. Seide, M. Seltzer, G. Zweig, Xiaodong He, J. Williams, Y. Gong, A. Acero

{"title":"微软语音研究中深度学习的最新进展","authors":"L. Deng, Jinyu Li, J. Huang, K. Yao, Dong Yu, F. Seide, M. Seltzer, G. Zweig, Xiaodong He, J. Williams, Y. Gong, A. Acero","doi":"10.1109/ICASSP.2013.6639345","DOIUrl":null,"url":null,"abstract":"Deep learning is becoming a mainstream technology for speech recognition at industrial scale. In this paper, we provide an overview of the work by Microsoft speech researchers since 2009 in this area, focusing on more recent advances which shed light to the basic capabilities and limitations of the current deep learning technology. We organize this overview along the feature-domain and model-domain dimensions according to the conventional approach to analyzing speech systems. Selected experimental results, including speech recognition and related applications such as spoken dialogue and language modeling, are presented to demonstrate and analyze the strengths and weaknesses of the techniques described in the paper. Potential improvement of these techniques and future research directions are discussed.","PeriodicalId":183968,"journal":{"name":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"101 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"770","resultStr":"{\"title\":\"Recent advances in deep learning for speech research at Microsoft\",\"authors\":\"L. Deng, Jinyu Li, J. Huang, K. Yao, Dong Yu, F. Seide, M. Seltzer, G. Zweig, Xiaodong He, J. Williams, Y. Gong, A. Acero\",\"doi\":\"10.1109/ICASSP.2013.6639345\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning is becoming a mainstream technology for speech recognition at industrial scale. In this paper, we provide an overview of the work by Microsoft speech researchers since 2009 in this area, focusing on more recent advances which shed light to the basic capabilities and limitations of the current deep learning technology. We organize this overview along the feature-domain and model-domain dimensions according to the conventional approach to analyzing speech systems. Selected experimental results, including speech recognition and related applications such as spoken dialogue and language modeling, are presented to demonstrate and analyze the strengths and weaknesses of the techniques described in the paper. Potential improvement of these techniques and future research directions are discussed.\",\"PeriodicalId\":183968,\"journal\":{\"name\":\"2013 IEEE International Conference on Acoustics, Speech and Signal Processing\",\"volume\":\"101 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-05-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"770\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE International Conference on Acoustics, Speech and Signal Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASSP.2013.6639345\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE International Conference on Acoustics, Speech and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2013.6639345","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 770

摘要

深度学习正在成为工业规模语音识别的主流技术。在本文中，我们概述了微软语音研究人员自2009年以来在这一领域的工作，重点介绍了当前深度学习技术的基本能力和局限性的最新进展。我们根据分析语音系统的传统方法，沿着特征域和模型域维度组织这一概述。本文给出了一些实验结果，包括语音识别和相关应用，如口语对话和语言建模，以展示和分析本文所述技术的优缺点。讨论了这些技术的改进潜力和未来的研究方向。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Recent advances in deep learning for speech research at Microsoft

Deep learning is becoming a mainstream technology for speech recognition at industrial scale. In this paper, we provide an overview of the work by Microsoft speech researchers since 2009 in this area, focusing on more recent advances which shed light to the basic capabilities and limitations of the current deep learning technology. We organize this overview along the feature-domain and model-domain dimensions according to the conventional approach to analyzing speech systems. Selected experimental results, including speech recognition and related applications such as spoken dialogue and language modeling, are presented to demonstrate and analyze the strengths and weaknesses of the techniques described in the paper. Potential improvement of these techniques and future research directions are discussed.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 IEEE International Conference on Acoustics, Speech and Signal Processing

自引率

0.00%

发文量