Transient modeling for overlap-add sinusoidal model of speech

2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Pub Date: 2013-05-26. DOI: 10.1109/ICASSP.2013.6639261

Slava Shechtman


Citations: 2

Abstract


Sinusoidal modeling of speech has been applied successfully to a broad range of speech analysis, synthesis, and modification tasks. In most cases it reproduces high-quality speech; for speech transients (e.g., plosives, glottal stops), however, fidelity drops because irregularities within a frame are not modeled. Various extensions of the stationary sinusoidal model have been proposed to address this problem. One simple and well-known approach is to incorporate an intra-frame magnitude envelope into the sinusoidal model, which has previously been done with an iterative analysis-by-synthesis procedure. In this paper we derive an optimal analytic solution to this problem and show that it yields a significantly better model fit than the known analysis-by-synthesis approach.
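To make the idea of an intra-frame magnitude envelope concrete, here is a minimal Python sketch. It is not the paper's analytic solution, nor the analysis-by-synthesis procedure it compares against, since the abstract specifies neither. It assumes known sinusoid frequencies, a linear envelope `p0 + p1*t` shared by all sinusoids, and a simple alternating least-squares loop. The function names, the envelope shape, and the synthetic test frame are all hypothetical choices made for illustration.

```python
import numpy as np


def sinusoid_basis(freqs_rad, n_samples):
    """Columns are cos(w*n) and sin(w*n) for each angular frequency w."""
    n = np.arange(n_samples)[:, None]
    phase = n * np.asarray(freqs_rad, dtype=float)[None, :]
    return np.hstack([np.cos(phase), np.sin(phase)])


def fit_stationary(frame, freqs_rad):
    """Stationary model: constant amplitude and phase per sinusoid in the frame."""
    basis = sinusoid_basis(freqs_rad, len(frame))
    coef, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    return basis @ coef


def fit_with_linear_envelope(frame, freqs_rad, n_iter=20):
    """Illustrative alternating fit of frame ~ (p0 + p1*t) * sum of sinusoids.

    Assumption: a linear envelope shared by all sinusoids. This loop is a
    stand-in for illustration, not the method derived in the paper.
    """
    n_samples = len(frame)
    t = np.linspace(-1.0, 1.0, n_samples)  # normalized intra-frame time
    basis = sinusoid_basis(freqs_rad, n_samples)
    env = np.ones(n_samples)  # start from the stationary model
    for _ in range(n_iter):
        # Envelope fixed: the sinusoid coefficients are a linear LS problem.
        coef, *_ = np.linalg.lstsq(env[:, None] * basis, frame, rcond=None)
        carrier = basis @ coef
        # Sinusoids fixed: the envelope parameters are a linear LS problem.
        design = np.column_stack([carrier, carrier * t])
        p, *_ = np.linalg.lstsq(design, frame, rcond=None)
        env = p[0] + p[1] * t
    return env * carrier, env


# Synthetic "transient-like" frame: two sinusoids under a rising ramp.
n_samples = 256
freqs = 2 * np.pi * np.array([0.05, 0.11])  # radians per sample
n = np.arange(n_samples)
ramp = np.linspace(0.2, 1.0, n_samples)
frame = ramp * (np.cos(freqs[0] * n) + 0.5 * np.sin(freqs[1] * n))

err_stationary = np.linalg.norm(frame - fit_stationary(frame, freqs))
model, envelope = fit_with_linear_envelope(frame, freqs)
err_envelope = np.linalg.norm(frame - model)
print(f"stationary residual: {err_stationary:.4f}")
print(f"envelope residual:   {err_envelope:.4f}")
```

Because the first pass uses a flat envelope and each least-squares step cannot increase the residual, the envelope fit should never do worse on a given frame than the stationary fit, up to numerical precision. How much it helps depends on how well the assumed envelope shape matches the actual transient.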

