Authors: A. S. Leonov, V. N. Sorokin
Journal: Acoustical Physics, vol. 69, no. 6, pp. 871-883 (impact factor 0.9; JCR Q4, Acoustics)
Published: 2024-02-28
DOI: 10.1134/S1063771023601140
URL: https://link.springer.com/article/10.1134/S1063771023601140
Citations: 0
Assessment of Tracks of Resonance Frequencies of the Vocal Tract
A new method for estimating formant frequency tracks of the vocal tract for arbitrary speech segments is proposed. The method uses the ratio of two Fourier transforms of a speech signal with special exponential-type windows depending on some parameter. This ratio is used for specific points in time and is considered as a function of frequency and parameter. By analyzing, for several parameter values, the distribution of minimum points (in terms of frequency) for the phase of this ratio and/or a similar distribution of extreme points for its amplitude, it is possible to estimate formant frequencies from the peaks of these distributions. A mathematical study is presented that substantiates this approach. A series of numerical experiments were carried out on the processing of synthetic and real speech signals, which confirmed the performance capabilities of the proposed formant evaluation method. In particular, in experiments with synthesized vowels, it was found that the error in estimating their resonance frequencies is small and stable with respect to additive noise up to a signal-to-noise ratio of 5 dB. For real speech, the method makes it possible to calculate the formant frequency tracks for both sounds with vocal excitation and for voiceless fricatives, aspirated plosives, and whispered speech.
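The core idea can be illustrated with a minimal sketch, under assumptions that go beyond the abstract. The abstract does not give the exact windows or parameter values, so the code below assumes plain decaying exponentials exp(-a t) and exp(-2 a t), uses only the amplitude of the ratio (not its phase), and tests on a toy signal made of three damped sinusoids. All names and numbers (`synth_vowel`, `ratio_peaks`, `estimate_formants`, the 50 Hz bins, the 1.1 height threshold) are hypothetical and are not taken from the paper.

```python
import cmath
import math
from collections import Counter

FS = 8000.0                               # sampling rate in Hz (assumed)
TRUE_FORMANTS = [700.0, 1200.0, 2600.0]   # resonances of the toy signal (assumed)
BANDWIDTH = 80.0                          # Hz, identical for all resonances (assumed)


def synth_vowel(n_samples):
    """Sum of exponentially damped sinusoids, a crude stand-in for one pitch period."""
    out = []
    for n in range(n_samples):
        t = n / FS
        damping = math.exp(-math.pi * BANDWIDTH * t)
        out.append(sum(damping * math.sin(2 * math.pi * f * t) for f in TRUE_FORMANTS))
    return out


def windowed_ft(signal, decay, freq_hz):
    """Fourier transform of signal[n] * exp(-decay * t_n) at a single frequency."""
    z = cmath.exp(complex(-decay, -2 * math.pi * freq_hz) / FS)
    acc = 0j
    for x in reversed(signal):            # Horner evaluation of sum_n signal[n] * z**n
        acc = acc * z + x
    return acc


def ratio_peaks(signal, decay, freqs, min_height=1.1):
    """Grid frequencies where |F_a / F_2a| has a local maximum above min_height."""
    amp = [abs(windowed_ft(signal, decay, f) / windowed_ft(signal, 2 * decay, f))
           for f in freqs]
    return [freqs[i] for i in range(1, len(amp) - 1)
            if amp[i] > amp[i - 1] and amp[i] >= amp[i + 1] and amp[i] > min_height]


def estimate_formants(signal, decays, freqs, bin_hz=50.0):
    """Histogram the ratio maxima over all decay parameters; keep bins hit every time."""
    votes = Counter()
    for a in decays:
        bins = {round(f / bin_hz) * bin_hz for f in ratio_peaks(signal, a, freqs)}
        for b in bins:                    # at most one vote per bin per parameter value
            votes[b] += 1
    return sorted(b for b, v in votes.items() if v == len(decays))


if __name__ == "__main__":
    signal = synth_vowel(480)                        # 60 ms at 8 kHz
    freqs = [200.0 + 10.0 * i for i in range(331)]   # 200..3500 Hz in 10 Hz steps
    print(estimate_formants(signal, [100.0, 150.0, 200.0, 250.0], freqs))
```

For a single damped resonance with decay rate b, the amplitude of this ratio near the resonance behaves approximately like sqrt(((2a + b)^2 + d^2) / ((a + b)^2 + d^2)), where d is the distance from the resonance frequency in rad/s, so it peaks at d = 0. That is why maxima of the ratio amplitude cluster near the resonances in this toy setting; real speech, noise, and the phase-based variant described in the abstract are outside the scope of the sketch.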
About the journal:
Acoustical Physics is an international peer-reviewed journal published with the participation of the Russian Academy of Sciences. It covers theoretical and experimental aspects of basic and applied acoustics: classical problems of linear acoustics and wave theory; nonlinear acoustics; physical acoustics; ocean acoustics and hydroacoustics; atmospheric and aeroacoustics; acoustics of structurally inhomogeneous solids; geological acoustics; acoustical ecology, noise and vibration; chamber acoustics, musical acoustics; acoustic signal processing, computer simulations; acoustics of living systems, biomedical acoustics; physical principles of engineering acoustics. The journal publishes critical reviews, original articles, short communications, and letters to the editor. It welcomes manuscripts from all countries in English or Russian.