Parametric stereo extension of ITU-T G.722 based on a new downmixing scheme

Thi Minh Nguyet Hoang, S. Ragot, Balázs Kövesi, P. Scalart
Published in: 2010 IEEE International Workshop on Multimedia Signal Processing
Publication date: 2010-12-10
DOI: 10.1109/MMSP.2010.5662017 (https://doi.org/10.1109/MMSP.2010.5662017)
Citations: 4

Abstract

This paper presents a novel frequency-domain stereo-to-mono downmix that preserves the energy of spectral components and avoids using either the left or the right channel as the phase reference. Based on this downmix, a parametric stereo analysis-synthesis model is described in which the subband stereo parameters are interchannel level differences and phase differences between the mono signal and one of the stereo channels (left or right). The model is applied to the stereo extension of ITU-T G.722 at 56+8 and 64+16 kbit/s with a 5 ms frame length. AB test results assess the quality of the proposed downmix. In addition, the quality of the proposed G.722-based stereo coder is compared against reference coders (G.722.1 at 24 and 32 kbit/s dual mono, and G.722 at 64 kbit/s dual mono) for clean speech, noisy speech, and music.
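The abstract does not state the downmix formula, so the following is only a minimal sketch of the kind of processing it describes, not the authors' method. It assumes an illustrative rule in which each mono bin takes the RMS magnitude of the two channel bins (so per-bin energy is kept) and the phase of their sum (so neither channel alone is the phase reference). All function names are hypothetical, and the bit-budget helper assumes the second figure in "56+8" and "64+16" is the rate of the stereo extension layer.

```python
import cmath
import math

def downmix_bin(l, r, eps=1e-12):
    """Downmix one complex spectral bin (illustrative rule, an assumption)."""
    # RMS magnitude: |m|^2 = (|l|^2 + |r|^2) / 2, even for out-of-phase input.
    mag = math.sqrt((abs(l) ** 2 + abs(r) ** 2) / 2.0)
    # Phase of the sum, so neither channel alone is the reference.
    s = l + r
    # Degenerate case l == -r: the sum has no phase; fall back to l (assumption).
    phase = cmath.phase(s) if abs(s) > eps else cmath.phase(l)
    return cmath.rect(mag, phase)

def ild_db(left_bins, right_bins, eps=1e-12):
    """Interchannel level difference of one subband, in dB."""
    e_left = sum(abs(x) ** 2 for x in left_bins)
    e_right = sum(abs(x) ** 2 for x in right_bins)
    return 10.0 * math.log10((e_left + eps) / (e_right + eps))

def channel_to_mono_phase(channel_bins, mono_bins):
    """Phase difference between one stereo channel and the mono signal
    over a subband, taken from the summed cross-spectrum."""
    cross = sum(c * m.conjugate() for c, m in zip(channel_bins, mono_bins))
    return cmath.phase(cross)

def stereo_bits_per_frame(rate_kbit_s, frame_ms):
    """Bits available per frame: kbit/s times ms gives bits."""
    return round(rate_kbit_s * frame_ms)

if __name__ == "__main__":
    l, r = 1 + 0j, -1 + 0j            # fully out-of-phase bin
    print(abs((l + r) / 2))           # passive downmix cancels to zero
    print(abs(downmix_bin(l, r)))     # sketch keeps the RMS magnitude
    print(stereo_bits_per_frame(8, 5), stereo_bits_per_frame(16, 5))
```

Under these assumptions, a 5 ms frame leaves 40 bits per frame for stereo parameters at 8 kbit/s and 80 bits at 16 kbit/s, which shows how tightly the level and phase parameters would have to be quantized.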