Afan Oromo Speech-Based Computer Command and Control: An Evaluation with Selected Commands
Authors: Kebede Teshite, Getachew Mamo, Kris Calpotura
Journal: Advances in Human-Computer Interaction (impact factor 2.3; JCR Q3, Computer Science, Artificial Intelligence)
Published: 2023-10-16
Publication type: Journal Article
DOI: 10.1155/2023/9959015 (https://doi.org/10.1155/2023/9959015)
Citation count: 0
Abstract
Speech-based computer command and control uses natural speech to let computers understand human language and execute tasks through spoken commands. However, no speech-based command and control system for Microsoft Word has previously been studied or developed for Afan Oromo. The primary aim of this research is to investigate and develop such a system for Afan Oromo using a selected set of command-and-control words from MS Word. To this end, a small-vocabulary, isolated-word, speaker-independent, HMM-based speech recognizer was developed using the HTK toolkit. The selected MS Word command words were first translated from English to Afan Oromo. Audio recordings were obtained from 38 speakers (16 female and 22 male) aged between 18 and 40 years, selected based on their availability. Word-level speech recognition was performed using MFCC features and data processing, both widely used and effective approaches in speech recognition. Of the 64 MS Word command words, 54 (84.37%) were used for training and 10 (15.63%) for testing. Live and nonlive evaluation techniques were employed to assess the recognizer. The live recognizer, which considers variations in the environment, outperformed the nonlive recognizer due to the influence of neighboring phones. The monophone, tied-state triphone, and triphone-based recognizers achieved 78.12%, 86.87%, and 88.99%, respectively; thus, the triphone-based recognizer performed best among the nonlive recognizers. Because of limited resources, this study was restricted to investigating speech-based computer commands using only selected MS Word commands, which play a crucial role in text processing. No object-as-a-service components were available for evaluating the speech-based interface in a real environment. The experimental findings demonstrate that, given an adequate amount of language resources, an Afan Oromo speech-based interface for computer command and control could be developed.
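The training and testing shares quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the word counts come from the abstract, while the variable and function names are hypothetical and not taken from the paper. It suggests that the reported 84.37% is the exact value 84.375% with the last digit truncated.

```python
# Word counts as stated in the abstract.
TOTAL_WORDS = 64
TRAIN_WORDS = 54
TEST_WORDS = 10


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage (illustrative helper, not from the paper)."""
    return 100.0 * part / whole


train_share = share(TRAIN_WORDS, TOTAL_WORDS)  # exact share of words used for training
test_share = share(TEST_WORDS, TOTAL_WORDS)    # exact share of words used for testing

print(f"training: {train_share:.3f}%  testing: {test_share:.3f}%")
```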