A method for extracting the semantic features of speech signal recognition based on empirical wavelet transform

Q3 Computer Science

Radioelectronic and Computer Systems | Pub Date: 2023-09-29 | DOI: 10.32620/reks.2023.3.09

Oleksandr Lavrynenko, Denis Bakhtiiarov, Vitaliy Kurushkin, Serhii Zavhorodnii, Veniamin Antonov, Petro Stanko


Citations: 0


A method for extracting the semantic features of speech signal recognition based on empirical wavelet transform

This study addresses methods for improving the efficiency of semantic coding of speech signals, and its purpose is to develop such a method. Coding efficiency here means reducing the information transmission rate at a given probability of error-free recognition of the semantic features of speech signals; this significantly reduces the required source bandwidth and thereby increases the available communication channel bandwidth. To achieve this goal, the following scientific tasks must be solved: (1) investigate a known method for improving the efficiency of semantic coding of speech signals based on mel-frequency cepstral coefficients; (2) substantiate the effectiveness of the adaptive empirical wavelet transform in multiscale analysis and semantic coding of speech signals; (3) develop a method of semantic coding of speech signals based on the adaptive empirical wavelet transform, followed by Hilbert spectral analysis and optimal thresholding; and (4) objectively and quantitatively assess the gain in efficiency of the developed method of semantic coding of speech signals over the existing method.
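The Hilbert spectral analysis step named in task (3) can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the function names, the 8 kHz sampling rate, and the 200 Hz test tone are assumptions chosen for illustration, and the thresholding step is not shown because the abstract does not specify its rule. The sketch builds the analytic signal of one mode with an FFT and reads off the instantaneous amplitude and frequency.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real 1-D array, computed via the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_amplitude_frequency(x, fs):
    """Instantaneous amplitude (n samples) and frequency in Hz (n - 1 samples)."""
    z = analytic_signal(x)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    # Frequency is the phase derivative, approximated by a first difference.
    frequency = np.diff(phase) * fs / (2.0 * np.pi)
    return amplitude, frequency


# Hypothetical stand-in for a single empirical mode: a 200 Hz tone, amplitude 0.5.
fs = 8000.0                      # assumed sampling rate, not from the paper
t = np.arange(800) / fs          # 0.1 s, exactly 20 periods of the tone
mode = 0.5 * np.cos(2.0 * np.pi * 200.0 * t)
amp, freq = instantaneous_amplitude_frequency(mode, fs)
```

For this periodic test tone the sketch recovers a constant amplitude of 0.5 and a constant frequency of 200 Hz to within floating-point tolerance. Real speech modes vary over time, which is the case the amplitude and frequency tracks are meant to capture.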
The following scientific results were obtained. First, a method of semantic coding of speech signals based on the empirical wavelet transform was developed for the first time. It differs from existing methods in that it constructs a set of adaptive bandpass Meyer wavelet filters and then applies Hilbert spectral analysis to find the instantaneous amplitudes and frequencies of the internal empirical mode functions, which makes it possible to identify the semantic features of speech signals and to code them more efficiently. Second, the adaptive empirical wavelet transform is proposed for the first time for multiscale analysis and semantic coding of speech signals; it increases the efficiency of spectral analysis by decomposing the high-frequency speech oscillation into its low-frequency components, namely the internal empirical modes. Third, the method of semantic coding of speech signals based on mel-frequency cepstral coefficients was further developed by applying the basic principles of adaptive spectral analysis with the empirical wavelet transform, which increases the efficiency of that method. Conclusions: We developed a method for semantic coding of speech signals based on the empirical wavelet transform that reduces the encoding rate from 320 to 192 bps and the required bandwidth from 40 to 24 Hz at a probability of error-free recognition of approximately 0.96 (96%) and a signal-to-noise ratio of 48 dB; by this measure, it is 1.6 times more efficient than the existing method. We also developed an algorithm for semantic coding of speech signals based on the empirical wavelet transform and implemented it in software in MATLAB R2022b.
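The figures quoted in the conclusions can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers stated in the abstract; the variable names are illustrative, and the interpretation of the gain as a simple before/after ratio is an assumption, since the abstract does not define how the 1.6 figure was computed.

```python
from fractions import Fraction

# Figures quoted in the abstract.
rate_before_bps, rate_after_bps = 320, 192
bandwidth_before_hz, bandwidth_after_hz = 40, 24

# Before/after ratios, kept exact with Fraction.
rate_ratio = Fraction(rate_before_bps, rate_after_bps)
bandwidth_ratio = Fraction(bandwidth_before_hz, bandwidth_after_hz)

# Relative saving in encoding rate.
rate_saving = Fraction(rate_before_bps - rate_after_bps, rate_before_bps)

# Bits per second carried per hertz, before and after.
density_before = Fraction(rate_before_bps, bandwidth_before_hz)
density_after = Fraction(rate_after_bps, bandwidth_after_hz)

print(rate_ratio, bandwidth_ratio, rate_saving, density_before, density_after)
```

Both ratios come out to 5/3 (about 1.67), the rate saving is 2/5 (40%), and the rate-to-bandwidth ratio stays at 8 bps per Hz before and after. Under the simple-ratio assumption, the 1.6 quoted in the abstract is slightly below 5/3, so it may reflect truncation or a somewhat different efficiency measure.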


Source journal

Radioelectronic and Computer Systems (Computer Science: Computer Graphics and Computer-Aided Design)

CiteScore

3.60

Self-citation rate

0.00%

Number of publications

Review time

2 weeks