Maximum Correntropy Linear Prediction for Voice Inverse Filtering: Theoretical Framework and Practical Implementation.

IEEE transactions on audio, speech, and language processing (2025). Pub Date: 2025-01-01. Epub Date: 2024-12-05. DOI: 10.1109/taslp.2024.3512187

Iván A Zalazar, Gabriel A Alzamendi, Matías Zañartu, Gastón Schlotthauer

Volume 33, pages 152-162. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12226812/pdf/

Citations: 0

Abstract

Voice inverse filtering methods aim to noninvasively estimate glottal source information from the voice signal. These inverse filtering strategies typically rely on parametric models and variants of linear prediction for tuning the vocal tract filter. Weighted linear prediction schemes have proved to be the best performing for inverse filtering applications. However, linear prediction and its variants are sensitive to the impulse-like acoustic excitations triggered by abrupt glottal closure during voiced phonation. The present study examines maximum correntropy criterion-based linear prediction (MCLP) for voice inverse filtering. Correntropy is a nonlinear, localized similarity measure inherently insensitive to peak-like outliers. Here, a theoretical framework is established for studying the properties of correntropy relevant to voice inverse filtering and for developing an algorithm to estimate vocal tract filter coefficients. The proposed algorithm results in a robust weighted linear prediction, where a correntropy weighting function is adjusted iteratively by a data-driven optimization scheme. The effects of correntropy kernel parameters on the performance of the MCLP method are analyzed. The MCLP method for voice inverse filtering is characterized using synthetic and natural sustained vowel signals. Simulations show that MCLP naturally assigns greater weight to samples in the glottal closed phase, where the phonation model is more accurate. MCLP requires neither prior information about the glottal instants nor a predefined weighting function. Results show that MCLP performs similarly to or better than other well-established inverse filtering methods based on weighted linear prediction.
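The abstract describes MCLP as a weighted linear prediction whose weights come from a correntropy kernel and are updated iteratively. The Python sketch below illustrates that general idea with a Gaussian kernel and an iteratively reweighted least-squares loop. It is not the authors' algorithm: the function name mclp_sketch, the parameters sigma_scale, n_iter and tol, the kernel-size rule based on the median absolute residual, and the small ridge term are all assumptions made for illustration, since the abstract does not specify how the kernel parameters are optimized.

```python
import numpy as np


def mclp_sketch(signal, order, sigma_scale=1.0, n_iter=20, tol=1e-8):
    """Illustrative correntropy-weighted linear prediction (hypothetical helper).

    Returns the predictor coefficients a[0..order-1] for the model
    s[n] ~ sum_k a[k-1] * s[n-k], and the per-sample weights from the
    final reweighting pass.
    """
    s = np.asarray(signal, dtype=float)

    # Regression form: target y[i] = s[order + i], regressors are past samples.
    y = s[order:]
    X = np.column_stack([s[order - k:len(s) - k] for k in range(1, order + 1)])

    # Start from ordinary (unweighted) least-squares linear prediction.
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = np.ones_like(y)

    for _ in range(n_iter):
        e = y - X @ a  # prediction residual under the current coefficients

        # ASSUMPTION: kernel size tied to a robust residual scale (scaled MAD).
        sigma = sigma_scale * 1.4826 * np.median(np.abs(e)) + 1e-12

        # Gaussian-kernel weights: large (impulse-like) residuals get ~0 weight.
        w = np.exp(-(e ** 2) / (2.0 * sigma ** 2))

        # Weighted normal equations; the tiny ridge is only a numerical safeguard.
        Xw = X * w[:, None]
        a_new = np.linalg.solve(X.T @ Xw + 1e-10 * np.eye(order), Xw.T @ y)

        converged = np.linalg.norm(a_new - a) <= tol * (1.0 + np.linalg.norm(a))
        a = a_new
        if converged:
            break

    return a, w
```

In this sketch, samples whose prediction error is large relative to the kernel size, such as those driven by an impulse-like excitation, receive weights near zero, so the fit is dominated by samples the linear model explains well. No glottal instants or predefined weighting function are supplied; the weights are derived from the residual alone.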

