{"title":"结合趋势数据的基于云的语音识别引擎的动态改进","authors":"Milind Bhavsar, Prudhvi Kosaraju, G. Ananthakrishnan, Gurudas Subray Shet, Saurav Anand","doi":"10.1109/MobileCloud.2016.12","DOIUrl":null,"url":null,"abstract":"With the advancement of speech recognition technologies, there is an increase in the adoption of voice interfaces on mobile-based platforms. While, developing a general purpose Automatic Speech Recognition (ASR) which can understand voice commands is important, the contexts of how people interact with their mobile device change very rapidly. Due to the high processing complexity of the ASR engine, much of the processing of trending data is being carried out on cloudplatforms. Changed content regarding news, music, movies and TV series change the focus of interaction with voice based interfaces. Hence ASR engines trained on a static vocabulary may not be able to adapt to the changing contexts. The focus of this paper is to first describe the problems faced in incorporating dynamically changing vocabulary and contexts into an ASR engine. We then propose a novel solution which shows a relative improvement of 38 percent utterance accuracy on newly added content without compromising on the overall accuracy and stability of the system.","PeriodicalId":176270,"journal":{"name":"2016 4th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Dynamic Improvements in a Cloud-Based Speech Recognition Engine by Incorporating Trending Data\",\"authors\":\"Milind Bhavsar, Prudhvi Kosaraju, G. Ananthakrishnan, Gurudas Subray Shet, Saurav Anand\",\"doi\":\"10.1109/MobileCloud.2016.12\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the advancement of speech recognition technologies, there is an increase in the adoption of voice interfaces on mobile-based platforms. While, developing a general purpose Automatic Speech Recognition (ASR) which can understand voice commands is important, the contexts of how people interact with their mobile device change very rapidly. Due to the high processing complexity of the ASR engine, much of the processing of trending data is being carried out on cloudplatforms. Changed content regarding news, music, movies and TV series change the focus of interaction with voice based interfaces. Hence ASR engines trained on a static vocabulary may not be able to adapt to the changing contexts. The focus of this paper is to first describe the problems faced in incorporating dynamically changing vocabulary and contexts into an ASR engine. 
We then propose a novel solution which shows a relative improvement of 38 percent utterance accuracy on newly added content without compromising on the overall accuracy and stability of the system.\",\"PeriodicalId\":176270,\"journal\":{\"name\":\"2016 4th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 4th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MobileCloud.2016.12\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 4th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MobileCloud.2016.12","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: With the advancement of speech recognition technologies, voice interfaces are increasingly being adopted on mobile platforms. While developing a general-purpose Automatic Speech Recognition (ASR) engine that can understand voice commands is important, the contexts in which people interact with their mobile devices change very rapidly. Due to the high processing complexity of the ASR engine, much of the processing of trending data is carried out on cloud platforms. Changing content related to news, music, movies, and TV series shifts the focus of interaction with voice-based interfaces, so an ASR engine trained on a static vocabulary may not be able to adapt to these changing contexts. This paper first describes the problems faced in incorporating dynamically changing vocabulary and contexts into an ASR engine. We then propose a novel solution that yields a relative improvement of 38 percent in utterance accuracy on newly added content without compromising the overall accuracy and stability of the system.
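The abstract does not spell out the mechanism, but the general idea of folding trending data into a cloud-hosted recognizer can be illustrated with a minimal, hypothetical Python sketch: a periodically refreshed trending vocabulary that biases n-best rescoring toward newly popular terms. Everything below (the DynamicVocabulary class, the rescore helper, the decay factor, and the example terms and scores) is an illustrative assumption, not the authors' implementation.

```python
from collections import Counter
from typing import Dict, List, Tuple


class DynamicVocabulary:
    """Illustrative sketch: a base vocabulary plus a decaying set of trending terms."""

    def __init__(self, base_terms: List[str], decay: float = 0.9):
        self.base = set(base_terms)
        self.trending: Counter = Counter()
        self.decay = decay  # how quickly old trends fade on each refresh cycle (assumed value)

    def refresh(self, trending_counts: Dict[str, int]) -> None:
        """Decay stale trending terms, then fold in newly observed counts."""
        for term in list(self.trending):
            self.trending[term] *= self.decay
            if self.trending[term] < 1:
                del self.trending[term]
        for term, count in trending_counts.items():
            self.trending[term] += count

    def boost(self, term: str) -> float:
        """Score bonus for a word when rescoring ASR hypotheses (0 for base-vocabulary words)."""
        if term in self.base:
            return 0.0
        return 0.1 * self.trending.get(term, 0)  # 0.1 is an arbitrary illustrative weight


def rescore(nbest: List[Tuple[str, float]], vocab: DynamicVocabulary) -> List[Tuple[str, float]]:
    """Re-rank an (text, score) n-best list by adding trending-term boosts to each hypothesis."""
    rescored = []
    for text, score in nbest:
        bonus = sum(vocab.boost(word) for word in text.lower().split())
        rescored.append((text, score + bonus))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    vocab = DynamicVocabulary(base_terms=["play", "the", "trailer", "news"])
    # Pretend a periodic cloud-side job fetched these counts from a trends feed (made-up numbers).
    vocab.refresh({"zootopia": 40, "deadpool": 25})
    nbest = [("play utopia trailer", -11.5), ("play zootopia trailer", -12.0)]
    print(rescore(nbest, vocab))  # the trending term lifts the second hypothesis to the top
```

In this toy example the trending boost flips the ranking in favor of the hypothesis containing the newly popular title, which is the kind of behavior the paper targets for newly added content; the actual system would additionally have to keep the boosts small enough not to degrade accuracy on the static vocabulary.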