Expressive speech analysis for epoch extraction using zero frequency filtering approach

D. Pravena, D. Govind
{"title":"Expressive speech analysis for epoch extraction using zero frequency filtering approach","authors":"D. Pravena, D. Govind","doi":"10.1109/TECHSYM.2016.7872689","DOIUrl":null,"url":null,"abstract":"The present work discusses the issues of epoch extraction from expressive speech signals. Epochs represent the accurate glottal closure instants in voiced speech which in turn give the accurate instants of maximum excitation of the vocal tract. Even though, there are many existing methods for epoch extraction, which provide near perfect epoch estimation from clean or neutral speech, these methods show significant drop in the epoch extraction performance for expressive speech signals. The occurrence of uncontrolled and rapid pitch variations in expressive speech signals cause degradation in the epoch extraction performance. The objective of the present work is to improve the epoch extraction performance of the speech signals with various perceptually distinct expressions compared to neutral speech using zero frequency filtering (ZFF) approach. In order to capture the rapid and uncontrolled variations in expressive speech utterances, trend removal is performed on short segments (25 ms) of the output obtained from the cascade of three zero frequency resonators (ZFR). The epoch estimation performance of the proposed method is compared with the conventional ZFF method, existing refined ZFF method proposed for expressive speech and recently proposed zero band filtering (ZBF) approach. The effectiveness of the approach is confirmed by the improved epoch identification rate and reduced miss and false alarm rates compared with that of the existing methods.","PeriodicalId":403350,"journal":{"name":"2016 IEEE Students’ Technology Symposium (TechSym)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Students’ Technology Symposium (TechSym)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TECHSYM.2016.7872689","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

The present work discusses the issues of epoch extraction from expressive speech signals. Epochs represent the accurate glottal closure instants in voiced speech which in turn give the accurate instants of maximum excitation of the vocal tract. Even though, there are many existing methods for epoch extraction, which provide near perfect epoch estimation from clean or neutral speech, these methods show significant drop in the epoch extraction performance for expressive speech signals. The occurrence of uncontrolled and rapid pitch variations in expressive speech signals cause degradation in the epoch extraction performance. The objective of the present work is to improve the epoch extraction performance of the speech signals with various perceptually distinct expressions compared to neutral speech using zero frequency filtering (ZFF) approach. In order to capture the rapid and uncontrolled variations in expressive speech utterances, trend removal is performed on short segments (25 ms) of the output obtained from the cascade of three zero frequency resonators (ZFR). The epoch estimation performance of the proposed method is compared with the conventional ZFF method, existing refined ZFF method proposed for expressive speech and recently proposed zero band filtering (ZBF) approach. The effectiveness of the approach is confirmed by the improved epoch identification rate and reduced miss and false alarm rates compared with that of the existing methods.
用零频率滤波方法提取历元的表达语音分析
本文讨论了从表达性语音信号中提取历元的问题。声门闭合的时间点代表了发声时声门关闭的准确时间点,声道关闭的时间点又给出了声道最大兴奋的准确时间点。尽管已有许多epoch提取方法可以从干净或中性语音中提供接近完美的epoch估计,但这些方法对表达性语音信号的epoch提取性能明显下降。表达性语音信号中出现不受控制的快速音高变化会导致历元提取性能下降。本研究的目的是利用零频率滤波(zero frequency filtering, ZFF)方法,提高具有各种感知上不同表达的语音信号的历元提取性能。为了捕捉表达性语音的快速和不受控制的变化,对从三个零频率谐振器(ZFR)级联获得的输出的短段(25毫秒)进行趋势去除。将该方法的历元估计性能与传统的ZFF方法、现有针对表达性语音提出的改进ZFF方法以及最近提出的零带滤波(ZBF)方法进行比较。与现有方法相比,提高了历元识别率,降低了漏报率和虚警率,验证了该方法的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信