A Novel Machine Learning-Based Artificial Voice Box

2022 Second International Conference on Advanced Technologies in Intelligent Control, Environment, Computing & Communication Engineering (ICATIECE). Pub Date: 2022-12-16. DOI: 10.1109/ICATIECE56365.2022.10046967

N. Kumar, Priya Nandihal, Madhumala R B, P. Pareek, Nikshepa T, Sowmya S R


Citations: 0

Abstract



Patients may experience considerable discomfort during the rigorous medical procedures used to identify vocal abnormalities. As a result, automated speech recognition and disorder detection approaches have attracted substantial interest in recent years and have been shown to be effective. For this study, voice recordings were acquired from the Saarbruecken Voice Database. The signals are preprocessed with a Hybrid Wiener Filter Discrete Wavelet Transform (HWFDWT) to de-noise them and remove any silence. Features are extracted using Cat Swarm Optimization with Mel Frequency Cepstrum Coefficients (CSOMFCC). The features are then classified using Modified Optimized Back Propagation Network Disorder voice Classification (MOBPNDC). In terms of accuracy, precision, recall, F-measure, and time period, the classification scheme outperforms the existing Support Vector Machine (SVM) and Back Propagation Neural Network (BPNN) approaches.

The neural speech system is a device that enables individuals who are unable to talk to communicate their thoughts and emotions to the outside world. It records the electrical pulses generated by the brain and turns them into a synthetic voice. The electrical activity of the brain is recorded and sent to a synthesizer, which decodes the signal and converts it into voice. The decoded voice is then supplied to an artificial voice box, which outputs it.
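The abstract names accuracy, precision, recall, and F-measure without defining them. As a minimal sketch, assuming the standard binary definitions with "disordered voice" as the positive class (the paper may instead average over several disorder classes), they could be computed as below. The function name `binary_metrics`, the 0/1 label encoding, and the sample labels are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f_measure)


# Hypothetical labels: 1 = disordered voice, 0 = healthy voice.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

With these made-up labels there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics come out to 0.75. Comparing classifiers such as SVM, BPNN, and MOBPNDC on this basis would require computing the same quantities on a shared held-out test set.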

