Architecture for Low Power Large Vocabulary Speech Recognition
Dhruba Chandra, U. Pazhayaveetil, P. Franzon
2006 IEEE International SOC Conference, 2006-09-01
DOI: 10.1109/SOCC.2006.283836 (https://doi.org/10.1109/SOCC.2006.283836)
Citations: 10
Abstract
This paper proposes an architecture for real-time large-vocabulary speech recognition on a mobile embedded device. The speech recognition system is based on Hidden Markov Models (HMMs), which involve computationally intensive operations such as probability estimation and Viterbi decoding. This computational load makes recognition power hungry, and porting software solutions to an embedded device does not achieve real-time recognition. Our system architecture combines a low-power embedded processor with dedicated ASIC units for the complex computations. These units operate at a low frequency of 50 MHz and therefore consume little power. The system uses RAM for intermediate values and flash memory to store the acoustic and language models.
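To make the cost of Viterbi decoding concrete, here is a minimal log-domain sketch in Python. It is a textbook formulation, not the paper's ASIC design: the function name `viterbi`, the list-of-lists table layout, and the dense all-states inner loop are assumptions made for illustration. A large-vocabulary recognizer would typically prune the search rather than visit every state at every frame.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return (best_state_path, best_log_prob) for one observation sequence.

    log_init[s]     : log P(state s at frame 0)
    log_trans[p][s] : log P(next state s | previous state p)
    log_emit[t][s]  : log P(observation at frame t | state s)
    (Hypothetical table layout chosen for this sketch.)
    """
    n_states = len(log_init)
    n_frames = len(log_emit)

    # Frame 0: initial probability plus emission score for each state.
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []  # backptr[t - 1][s] = best predecessor of state s at frame t

    for t in range(1, n_frames):
        new_score = []
        ptrs = []
        for s in range(n_states):
            # Pick the predecessor that maximizes the path score into s.
            best_prev = max(
                range(n_states), key=lambda p: score[p] + log_trans[p][s]
            )
            new_score.append(
                score[best_prev] + log_trans[best_prev][s] + log_emit[t][s]
            )
            ptrs.append(best_prev)
        score = new_score
        backptr.append(ptrs)

    # Trace back from the best final state to recover the state sequence.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, score[last]
```

The inner loops perform on the order of `n_frames * n_states**2` add-and-compare operations. That repetitive, regular arithmetic is the kind of work the abstract assigns to dedicated units instead of the embedded processor.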