Design of a Voice Recognition System Using Artificial Neural Network

Daniel Oluwatobi Mayowa, I. A. Olajide
International Journal of Electrical and Computer Engineering Research. Published 2024-03-15. DOI: 10.53375/ijecer.2024.371 (https://doi.org/10.53375/ijecer.2024.371)

Abstract

Voice recognition systems have become common in everyday life, with applications ranging from virtual assistants on smartphones to voice-controlled home automation systems. This paper presents the design and implementation of a voice recognition security system based on artificial neural networks. The system was trained on a dataset of 900 audio samples collected from 10 distinct speakers, so that the resulting model can classify the speaker of a given audio sample. The system is implemented in Python. It uses the Keras library, which offers a high-level interface for constructing and training neural networks, with computation handled by the TensorFlow back-end. Flask, a Python-based web framework, was used to build the user interface as a web application. Before training, the audio data is preprocessed: relevant features are extracted from the audio samples and each sample is labelled. The neural network is then trained on this labelled dataset to learn to classify the different speakers. The trained model was tested on a set of previously unseen audio samples and achieved a classification accuracy exceeding 96%. The finalized model will be integrated into the web application, enabling users to upload audio files and receive a prediction of the speaker's identity. The paper demonstrates the efficacy of artificial neural networks for voice recognition and provides a practical framework for constructing such systems using readily available tools and libraries.
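The labelling, held-out testing, and accuracy figure described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the file layout, the helper names (`label_samples`, `stratified_split`, `accuracy`), the 20% test fraction, and the assumption that each of the 10 speakers contributes 90 of the 900 samples are all hypothetical, chosen only to make the steps concrete. Feature extraction and the Keras model itself are left out.

```python
import random
from collections import defaultdict


def label_samples(paths_by_speaker):
    """Turn {speaker_name: [file, ...]} into (file, integer_label) pairs."""
    speakers = sorted(paths_by_speaker)
    pairs = [
        (path, idx)
        for idx, name in enumerate(speakers)
        for path in paths_by_speaker[name]
    ]
    return pairs, speakers


def stratified_split(pairs, test_fraction=0.2, seed=0):
    """Hold out the same fraction of every speaker's samples for testing."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in pairs:
        by_label[label].append((path, label))
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted speaker matches the true speaker."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical layout: 10 speakers, 90 clips each (the balance is an assumption).
dataset = {
    f"speaker_{i:02d}": [f"speaker_{i:02d}/clip_{j:03d}.wav" for j in range(90)]
    for i in range(10)
}
pairs, speakers = label_samples(dataset)
train, test = stratified_split(pairs)
```

Under these assumptions the split yields 720 training and 180 test samples. An accuracy above 96% on a test set of that size would correspond to at least 173 correctly identified clips, although the paper's actual test-set size is not stated in the abstract.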