Novel EMD-Based Technological Procedure for Speech Signal Processing

A. Alimuradov, A. Tychkov, P. Churakov, Bogdan A. Porezanov, Ilya O. Steshkin, Kirill E. Platonov, A. Baranova, D. S. Dudnikov
Published in: 2022 24th International Conference on Digital Signal Processing and its Applications (DSPA)
Publication date: 2022-03-30
DOI: https://doi.org/10.1109/dspa53304.2022.9790747
Citations: 1

Abstract

The article presents a novel technological procedure for speech signal processing based on empirical mode decomposition (EMD), an adaptive time-frequency analysis method. The procedure splits the original speech signal uniformly into fragments, decomposes each fragment into empirical modes, and forms new mode speech signals from them. Its goal is to expand the space of informatively significant amplitude, time, frequency, and energy characteristics of the original speech signal. Various types of empirical mode decomposition are briefly described, along with their advantages and disadvantages. The functionality of the proposed procedure is detailed, and the research outcomes are reported. Analysis of the results shows that the time needed to form a set of mode speech signals is lowest for fragments of 300–1000 ms, and that the error in forming the set is lowest when fragments are decomposed into 8–10 empirical modes, with the difference between the original and reconstructed signals being less than 0.001 V (0.1 %). The authors conclude that the procedure does expand the space of informatively significant amplitude, time, frequency, and energy characteristics by forming a set of new mode speech signals, and that it can therefore be used efficiently to form an optimal set of speech parameters relevant to naturally expressed human emotions.
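
The three steps named in the abstract (uniform splitting, per-fragment decomposition, formation of mode signals) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the envelopes are piecewise-linear rather than the cubic splines of classical EMD, the sifting stops after a fixed number of passes, and joining same-index modes across fragments is an assumption about how the mode speech signals are formed. The test signal, sampling rate, fragment length, and mode count are likewise chosen only for the demonstration.

```python
import math

def local_extrema(x):
    """Indices of interior local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices.
    Assumption: linear interpolation replaces the usual cubic spline."""
    pts = sorted(set([0] + knots + [len(x) - 1]))
    env = [0.0] * len(x)
    for a, b in zip(pts, pts[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = x[a] + t * (x[b] - x[a])
    return env

def sift(x, n_sift=10):
    """Extract one mode by repeatedly subtracting the mean envelope."""
    h = list(x)
    for _ in range(n_sift):  # fixed pass count, a simplified stopping rule
        maxima, minima = local_extrema(h)
        if len(maxima) + len(minima) < 3:
            break  # too few extrema left to build envelopes
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, n_modes):
    """Return n_modes components: n_modes - 1 sifted modes plus the residue."""
    modes, residue = [], list(x)
    for _ in range(n_modes - 1):
        mode = sift(residue)
        modes.append(mode)
        residue = [r - m for r, m in zip(residue, mode)]
    modes.append(residue)
    return modes

def mode_speech_signals(signal, frag_len, n_modes):
    """Split the signal into uniform fragments, decompose each one, and
    join the k-th components of all fragments into the k-th mode signal."""
    out = [[] for _ in range(n_modes)]
    for start in range(0, len(signal), frag_len):
        fragment = signal[start:start + frag_len]
        for k, mode in enumerate(emd(fragment, n_modes)):
            out[k].extend(mode)
    return out

# Demonstration on a synthetic two-tone signal (not real speech).
fs = 2000                              # assumed sampling rate, Hz
signal = [math.sin(2 * math.pi * 120 * n / fs)
          + 0.5 * math.sin(2 * math.pi * 15 * n / fs) for n in range(fs)]
frag_len = int(0.3 * fs)               # 300 ms fragments
modes = mode_speech_signals(signal, frag_len, n_modes=8)

# Summing all mode signals should give back the original signal.
reconstructed = [sum(col) for col in zip(*modes)]
error = max(abs(a - b) for a, b in zip(signal, reconstructed))
print("mode signals:", len(modes), "max reconstruction error:", error)
```

Because each mode is subtracted from the running residue, the components of a fragment sum back to that fragment by construction, so the reconstruction error here reflects floating-point rounding only. It is not comparable to the 0.001 V figure reported in the abstract, which refers to the authors' own procedure and data.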