{"title":"Use of Automatic Speech Recognition Systems for Multimedia Applications","authors":"Marcos Valadão Gualberto Ferreira, J. Souza","doi":"10.1145/3126858.3131630","journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"34 1","pages":"0"},"publicationDate":"2017-10-17","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"2","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/3126858.3131630"}
Use of Automatic Speech Recognition Systems for Multimedia Applications
The need to retrieve information from multimedia content increases the demand for systems that use automatic speech recognition. A speech recognition system enables a computer to interpret audio signals and generate approximate textual transcriptions. These systems rely on probabilistic models that aim to represent human speech robustly and accurately. This paper presents the architecture of a speech recognition system and describes its basic components: the acoustic model, the language model, the lexicon, and the decoder. The training process for the acoustic and language models is also presented. Finally, the paper shows how these systems can be used in several applications.
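To make the roles of the four components concrete, the following is a minimal sketch, not the authors' system. It assumes a toy three-word lexicon, a unigram language model, and an acoustic model reduced to per-frame phoneme likelihoods that align one-to-one with phonemes. All names (`LEXICON`, `LANGUAGE_MODEL`, `acoustic_log_prob`, `decode`) and all numbers are hypothetical. The decoder picks the word W that maximizes log P(X|W) + log P(W), the standard probabilistic decision rule for speech recognition.

```python
import math

# Hypothetical lexicon: maps each word to its phoneme sequence.
LEXICON = {
    "two": ("t", "uw"),
    "too": ("t", "uw"),
    "tea": ("t", "iy"),
}

# Hypothetical unigram language model: prior probability P(W) of each word.
LANGUAGE_MODEL = {"two": 0.5, "too": 0.3, "tea": 0.2}


def acoustic_log_prob(frames, phonemes):
    """Toy acoustic model: log P(X|W) for one phoneme sequence.

    Each frame is a dict of per-phoneme likelihoods, and frames are
    assumed to align one-to-one with phonemes (a strong simplification).
    """
    if len(frames) != len(phonemes):
        return -math.inf
    total = 0.0
    for frame, phoneme in zip(frames, phonemes):
        likelihood = frame.get(phoneme, 0.0)
        if likelihood <= 0.0:
            return -math.inf
        total += math.log(likelihood)
    return total


def decode(frames):
    """Toy decoder: return the (word, score) maximizing log P(X|W) + log P(W)."""
    best_word, best_score = None, -math.inf
    for word, phonemes in LEXICON.items():
        score = acoustic_log_prob(frames, phonemes) + math.log(LANGUAGE_MODEL[word])
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Two frames whose likelihoods favor the phonemes "t" then "uw".
frames = [{"t": 0.9, "d": 0.1}, {"uw": 0.7, "iy": 0.3}]
word, score = decode(frames)
print(word, round(score, 3))
```

In this sketch "two" and "too" share a pronunciation, so the acoustic score alone cannot separate them and the language model prior decides. A real decoder searches over word sequences and time alignments rather than enumerating single words.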