Authors: Hinda Dridi, K. Ouni
Published in: 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2018-03-21
DOI: 10.1109/ATSIP.2018.8364510 (https://doi.org/10.1109/ATSIP.2018.8364510)
Applying long short-term memory concept to hybrid “CD-NN-HMM” model for keywords spotting in continuous speech
Recently, the Long Short-Term Memory (LSTM) architecture has been shown to outperform other state-of-the-art approaches, such as the Deep Neural Network (DNN) and the Convolutional Neural Network (CNN), on many speech recognition tasks. The LSTM network aims to improve the modeling of long-range temporal dynamics and to remedy the vanishing and exploding gradient problems of the conventional recurrent neural network (RNN). Motivated by the tremendous success of the LSTM, we present in this paper a systematic approach to keyword spotting (KWS) in continuous speech. The system operates in two stages. In the first stage, the continuous speech is decoded into a phone sequence using a hybrid model that combines an LSTM network with a Hidden Markov Model (HMM), built with the open-source speech recognition toolkit Kaldi. In the second stage, the keywords are detected in this phone sequence using a Classification and Regression Tree (CART) implemented in MATLAB. The experiments are conducted on the TIMIT data set.
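To make the input and output of the second stage concrete, the following is a minimal Python sketch that scans a decoded phone sequence for keyword pronunciations. It is an illustration only: it uses plain sliding-window exact matching as a stand-in, not the CART classifier described in the abstract, and the lexicon, the simplified phone labels, the function name `spot_keywords`, and the example sequence are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical pronunciation lexicon with simplified TIMIT-style phone labels.
# These entries are assumptions for illustration, not taken from the paper.
LEXICON: Dict[str, List[str]] = {
    "dark": ["d", "aa", "r", "k"],
    "water": ["w", "ao", "t", "er"],
}


def spot_keywords(
    phones: List[str], lexicon: Dict[str, List[str]]
) -> List[Tuple[str, int, int]]:
    """Return (keyword, start, end) for each exact match; end is exclusive.

    This is a simple stand-in for the paper's CART-based detector: it only
    finds keywords whose phone sequence was decoded without any error.
    """
    hits: List[Tuple[str, int, int]] = []
    for word, pron in lexicon.items():
        n = len(pron)
        # Slide a window of the pronunciation's length over the phone sequence.
        for start in range(len(phones) - n + 1):
            if phones[start:start + n] == pron:
                hits.append((word, start, start + n))
    # Report hits in the order they occur in the utterance.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical first-stage output (a decoded phone sequence).
decoded = ["sil", "sh", "iy", "hh", "ae", "d", "y", "er",
           "d", "aa", "r", "k", "s", "uw", "t", "sil"]

if __name__ == "__main__":
    print(spot_keywords(decoded, LEXICON))
```

Exact matching breaks as soon as the first stage substitutes, inserts, or deletes a phone, which is one reason a learned classifier such as CART can be preferable for the detection step.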