Method for asynchronous analysis of a glottal source based on a two-level autoregressive model of the speech signal

V. V. Savchenko, L. V. Savchenko
Journal: Izmeritel`naya Tekhnika, vol. 195 4
Published: 2024-04-05 (Journal Article)
DOI: 10.32446/0368-1025it.2024-2-55-62 (https://doi.org/10.32446/0368-1025it.2024-2-55-62)

Abstract

The problem of analyzing a glottal source over a short observation interval is considered. It is noted that known methods for glottal source analysis perform inadequately regardless of how the data are prepared, whether synchronously with the fundamental tone (pitch) of speech sounds or asynchronously. A method for analyzing the glottal source based on a two-level autoregressive model of the speech signal is proposed, and its software implementation based on the high-speed Burg-Levinson computational procedure is described. The method does not require the observation sequence to be synchronized with the fundamental tone of the speech signal and has a relatively low computational cost. Using this implementation, a full-scale experiment was set up and conducted on vowel sounds from a control speaker's speech. The experimental results confirmed the improved performance of the proposed method and were used to formulate its requirements on speech signal duration for real-time voice analysis: the optimal duration lies in the range from 32 to 128 ms. The results can be used in the development and study of digital speech communication systems, voice control, biometrics, biomedicine, and other speech systems in which the speaker's voice characteristics are of paramount importance.
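The abstract names a Burg-Levinson procedure but gives no details of the two-level model, so the following is only a minimal sketch of the textbook single-level Burg recursion with a Levinson-style coefficient update. It is not the authors' implementation. The function names `burg_ar` and `synth_ar2`, the sign convention, and the synthetic AR(2) test signal are all assumptions made for illustration.

```python
import random


def burg_ar(x, order):
    """Estimate AR coefficients of one frame with Burg's method.

    Assumed convention: x[n] + a[1]*x[n-1] + ... + a[p]*x[n-p] = e[n].
    Returns (a, refl, err): a[0] == 1.0, refl holds the reflection
    coefficients, err is the final prediction-error power.
    """
    n = len(x)
    if not 0 < order < n:
        raise ValueError("order must satisfy 0 < order < len(x)")
    f = list(x)  # forward prediction errors
    b = list(x)  # backward prediction errors
    a = [1.0]
    refl = []
    err = sum(v * v for v in x) / n
    for m in range(order):
        # Reflection coefficient minimising forward + backward error power.
        num = 0.0
        den = 0.0
        for i in range(m + 1, n):
            num += f[i] * b[i - 1]
            den += f[i] * f[i] + b[i - 1] * b[i - 1]
        k = -2.0 * num / den if den > 0.0 else 0.0
        refl.append(k)
        # Levinson-style order update of the coefficient vector.
        ext = a + [0.0]
        a = [ext[i] + k * ext[m + 1 - i] for i in range(m + 2)]
        # Update the error sequences, walking downward so that b[i-1]
        # still holds the previous-order value when it is read.
        for i in range(n - 1, m, -1):
            fi = f[i]
            f[i] = fi + k * b[i - 1]
            b[i] = b[i - 1] + k * fi
        err *= 1.0 - k * k
    return a, refl, err


def synth_ar2(n, seed=0):
    """Hypothetical test signal: x[n] = 1.5*x[n-1] - 0.8*x[n-2] + e[n]."""
    rng = random.Random(seed)
    x1 = x2 = 0.0
    out = []
    for i in range(n + 200):  # first 200 samples are discarded warm-up
        x0 = 1.5 * x1 - 0.8 * x2 + rng.gauss(0.0, 1.0)
        x2, x1 = x1, x0
        if i >= 200:
            out.append(x0)
    return out


if __name__ == "__main__":
    # 1024 samples would be 128 ms at an assumed 8 kHz sampling rate.
    frame = synth_ar2(1024, seed=0)
    coeffs, refl, err = burg_ar(frame, 2)
    print(coeffs, refl, err)
```

Because the recursion works directly on the frame's forward and backward prediction errors, it needs no alignment of the frame with pitch periods, which is consistent with the asynchronous mode the abstract describes. The sampling rate used to relate frame length to the 32 to 128 ms range is an assumption; the source does not state it.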