A. Abougarair, Mohamed KI Aburakhis, Mohamed O Zaroug
Design and implementation of smart voice assistant and recognizing academic words
International Robotics & Automation Journal, published 2022-02-24
DOI: 10.15406/iratj.2022.08.00240 (https://doi.org/10.15406/iratj.2022.08.00240)
Citation count: 10
Abstract
This paper presents a virtual assistant that uses neural networks to recognize commonly used words. Its main purpose is to make users’ daily lives easier by capturing spoken commands and translating them into actions. The assistant, named Alice, is built on four main techniques: hot word detection, voice-to-text conversion, intent recognition, and text-to-voice conversion. Linux is the operating system chosen for developing and running the assistant because it is open source and has been ported to most single-board computers. Python is chosen as the development language for its capabilities and its compatibility with the APIs and libraries the project requires. The virtual assistant is also required to communicate with IoT devices. In addition, a speech recognition system is created to recognize significant technical words. An artificial neural network (ANN) is evaluated with different network structures and training algorithms, in conjunction with Mel Frequency Cepstral Coefficient (MFCC) feature extraction, to increase the recognition rate and find the best-performing configuration. For training, the Levenberg-Marquardt (LM), BFGS quasi-Newton, and Resilient Backpropagation algorithms are compared using 10 MFCCs with 10 to 50 neurons in increments of 10; the same comparison is repeated for 13 MFCCs, again with 10 to 50 neurons.
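The training comparison described in the abstract can be read as a small grid of configurations. The sketch below only enumerates that grid; it does not train anything. It rests on two assumptions that the abstract does not state outright: that the algorithm list names three separate methods (Levenberg-Marquardt, BFGS quasi-Newton, and resilient backpropagation), and that the 13-MFCC runs use the same step of 10 neurons as the 10-MFCC runs. The names `MFCC_COUNTS`, `NEURON_COUNTS`, `TRAINING_ALGORITHMS`, and `experiment_grid` are illustrative and do not come from the paper.

```python
from itertools import product

# Values taken from the abstract.
MFCC_COUNTS = (10, 13)
# Assumption: the 13-MFCC runs use the same step of 10 as the 10-MFCC runs.
NEURON_COUNTS = tuple(range(10, 51, 10))  # 10, 20, 30, 40, 50

# Assumption: the abstract's algorithm list names three separate methods.
TRAINING_ALGORITHMS = (
    "Levenberg-Marquardt",
    "BFGS quasi-Newton",
    "Resilient backpropagation",
)


def experiment_grid():
    """Return every (n_mfcc, n_neurons, algorithm) combination to compare."""
    return list(product(MFCC_COUNTS, NEURON_COUNTS, TRAINING_ALGORITHMS))


if __name__ == "__main__":
    grid = experiment_grid()
    print(len(grid), "configurations")
    for n_mfcc, n_neurons, algorithm in grid:
        print(f"{n_mfcc} MFCC, {n_neurons} neurons, {algorithm}")
```

Under these assumptions the comparison covers 2 × 5 × 3 = 30 configurations. If the algorithm list is read as only two methods, the grid shrinks to 20.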