Customized Wake-Up Word with Key Word Spotting using Convolutional Neural Network
T. Tsai, Ping-Cheng Hao
2019 International SoC Design Conference (ISOCC), 2019-10-01
DOI: 10.1109/ISOCC47750.2019.9027708 (https://doi.org/10.1109/ISOCC47750.2019.9027708)
This paper proposes a customized wake-up word system combined with key word spotting based on a neural network. The system operates in three phases: wake-up word training, wake-up word detection, and key word spotting. In the training phase, the user can say any word in any language, and the system automatically counts the number of syllables in that word. If the syllable count falls within the accepted range, the system accepts the word as the customized wake-up word. Features are then extracted from the word using Mel-Frequency Cepstral Coefficients (MFCC); these features are used to build the speaker model, the speech model, and the state sequence for the next phase. In the detection phase, the system detects an unknown voice segment, compares it with the models, and then decides whether to wake up. If the user says the correct wake-up word, the system proceeds to the key word spotting phase, in which the command words are fixed and a convolutional neural network serves as the key word spotting model. All processing runs without an Internet connection to protect user privacy. The system gives good results with a very small amount of wake-up word training data and runs in real time.
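The abstract does not say how syllables are counted or which range is accepted, so the following is only a minimal sketch of the training-phase gate under stated assumptions. It treats each contiguous run of high short-time energy as one syllable, which is a common heuristic and not necessarily the authors' method. The function names, the 3-to-5 syllable range, the threshold, and the frame sizes are all hypothetical.

```python
import math

def frame_energies(samples, frame_len=400, hop=160):
    """Short-time energy per frame (25 ms frames, 10 ms hop at 16 kHz)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies

def count_syllables(samples, rel_threshold=0.1, min_frames=3):
    """Count contiguous high-energy runs as a crude syllable estimate.

    Assumption: one energy burst equals one syllable. Real speech needs
    smoothing and a more careful peak picker.
    """
    energies = frame_energies(samples)
    if not energies or max(energies) == 0:
        return 0
    threshold = rel_threshold * max(energies)
    count, run = 0, 0
    for energy in energies:
        if energy > threshold:
            run += 1
        else:
            if run >= min_frames:
                count += 1
            run = 0
    if run >= min_frames:  # close a run that reaches the end of the signal
        count += 1
    return count

def accept_wake_word(samples, min_syllables=3, max_syllables=5):
    """Return (accepted, syllable_count); the range is a hypothetical choice."""
    n = count_syllables(samples)
    return (min_syllables <= n <= max_syllables), n

def synth_word(n_syllables, sample_rate=16000):
    """Synthetic test word: 150 ms tone bursts separated by 100 ms of silence."""
    burst = [math.sin(2 * math.pi * 220 * i / sample_rate) for i in range(2400)]
    gap = [0.0] * 1600
    return (burst + gap) * n_syllables

if __name__ == "__main__":
    for n in (1, 3, 7):
        print(n, accept_wake_word(synth_word(n)))
```

On the synthetic signal, a three-burst word passes the gate while one-burst and seven-burst words are rejected. In a full system an accepted word would then go on to MFCC feature extraction, which is not shown here.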