Design of a Voice Recognition System Using Artificial Neural Network

International Journal of Electrical and Computer Engineering Research. Pub Date: 2024-03-15. DOI: 10.53375/ijecer.2024.371

Daniel Oluwatobi Mayowa, I. A. Olajide



Abstract

Voice recognition systems are now common in everyday life, with applications ranging from virtual assistants on smartphones to voice-controlled home automation. This paper presents the design and implementation of a voice recognition security system based on artificial neural networks. The network was trained on a dataset of 900 audio samples collected from 10 distinct speakers, so that the resulting model can identify the speaker of a given audio sample. The system is implemented in Python. It uses the Keras library, which provides a high-level interface for building and training neural networks, with the TensorFlow back-end handling the computation. Flask, a Python web framework, was used to build the user interface as a web application. Before training, the audio data is preprocessed: relevant features are extracted from each sample, and each sample is labelled with its speaker. The neural network is then trained on this labelled dataset to classify the different speakers. The trained model was tested on a set of previously unseen audio samples and achieved a classification accuracy exceeding 96%. The finalized model will be integrated into the web application, so that users can upload an audio file and receive a prediction of the speaker's identity. The paper demonstrates the efficacy of artificial neural networks for voice recognition and provides a practical framework for building such systems with readily available tools and libraries.
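The labelling and held-out evaluation steps described in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. It is not the authors' code. The abstract gives only the totals (900 samples, 10 speakers), so the even split of 90 samples per speaker, the 20% test fraction, the file paths, and the helper names `build_labelled_index`, `stratified_split`, and `accuracy` are all assumptions made for illustration.

```python
import random
from collections import Counter


def build_labelled_index(speakers, samples_per_speaker):
    """Pair each (hypothetical) audio path with an integer speaker label."""
    label_of = {name: i for i, name in enumerate(sorted(speakers))}
    index = []
    for name in sorted(speakers):
        for k in range(samples_per_speaker):
            # Placeholder path; the paper does not describe its file layout.
            index.append((f"{name}/sample_{k:03d}.wav", label_of[name]))
    return index, label_of


def stratified_split(index, test_fraction, seed=0):
    """Hold out the same fraction of every speaker's samples for testing."""
    rng = random.Random(seed)
    by_label = {}
    for item in index:
        by_label.setdefault(item[1], []).append(item)
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted speaker matches the true speaker."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Assumed: 10 speakers with 90 samples each (900 in total), 20% held out.
speakers = [f"speaker_{i:02d}" for i in range(10)]
index, label_of = build_labelled_index(speakers, samples_per_speaker=90)
train, test = stratified_split(index, test_fraction=0.2)
print(len(index), len(train), len(test))  # prints: 900 720 180
```

In a full pipeline, the training pairs would go through feature extraction and into the Keras model, and `accuracy` would be computed on the model's predictions for the held-out pairs. Splitting per speaker keeps every speaker equally represented in the test set, so a single accuracy figure is not dominated by any one voice.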

