Real-Time Estimation of Speech Quality through the Internet Using Echo State Networks

Sebastián Basterrech, G. Rubino
Journal of Advances in Computer Networks. DOI: 10.7763/JACN.2013.V1.37 (https://doi.org/10.7763/JACN.2013.V1.37)
Citations: 3

Abstract

Audio quality over the Internet can be strongly affected by network conditions, and many techniques have been developed to evaluate it. In particular, in 2001 the ITU-T adopted a technique called Perceptual Evaluation of Speech Quality (PESQ) to measure speech quality automatically. PESQ is a well-known and widely used procedure that generally provides an accurate evaluation of perceptual quality by comparing the original and received voice sequences. An inherent limitation of PESQ is therefore that it requires the original signal (the reference) to make its evaluation. This precludes the use of PESQ for assessing perceived quality in real time, since the reference is generally not available at the receiver. In this paper, we describe a procedure for estimating the PESQ output using only measurements of the network state and properties of the communication system, without any use of the reference. The procedure is based on statistical learning techniques. Specifically, we rely on recent ideas for learning with a specific type of neural network known as the Echo State Network (ESN), a member of the class of Reservoir Computing systems. These tools have proven to be very efficient and robust in many learning tasks. The experimental results show that the resulting procedure is accurate and that it can deliver its estimates of speech quality in a real-time context. This makes it possible to embed our measuring modules in future Internet applications or services based on voice transmission, for instance for control purposes.
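To illustrate the kind of model the abstract refers to, here is a minimal Echo State Network sketch in Python with NumPy. It is not the authors' implementation. The two input features (a packet loss rate and a mean loss burst size), the reservoir size, the spectral radius, the ridge penalty, and the synthetic target score are all assumptions chosen for illustration; the target is a made-up stand-in, not real PESQ output. The sketch shows the general ESN recipe: a fixed random reservoir is driven by the inputs, and only a linear readout is trained, here by ridge regression.

```python
import numpy as np

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs + 1))  # +1 for bias
    w = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with an input sequence and collect its states."""
    states = np.zeros((len(inputs), w.shape[0]))
    x = np.zeros(w.shape[0])
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ np.concatenate(([1.0], u)) + w @ x)
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-4):
    """Fit the linear readout (the only trained part) by ridge regression."""
    s = np.hstack([np.ones((len(states), 1)), states])
    return np.linalg.solve(s.T @ s + ridge * np.eye(s.shape[1]), s.T @ targets)

def predict(states, w_out):
    """Apply the trained readout to reservoir states."""
    s = np.hstack([np.ones((len(states), 1)), states])
    return s @ w_out

# Synthetic data (assumption): hypothetical network measurements per time step.
rng = np.random.default_rng(1)
n_steps = 600
loss_rate = rng.uniform(0.0, 0.3, n_steps)   # hypothetical packet loss rate
burst_size = rng.uniform(1.0, 3.0, n_steps)  # hypothetical mean loss burst size
inputs = np.column_stack([loss_rate, burst_size])
# Made-up quality score used only to exercise the code; it is NOT PESQ.
targets = 4.5 - 8.0 * loss_rate - 0.3 * (burst_size - 1.0)

w_in, w = make_reservoir(n_inputs=2, n_units=50)
states = run_reservoir(w_in, w, inputs)

washout = 50  # skip the initial transient states when training
w_out = train_readout(states[washout:400], targets[washout:400])
pred = predict(states[400:], w_out)
rmse = float(np.sqrt(np.mean((pred - targets[400:]) ** 2)))
print(f"held-out RMSE on synthetic data: {rmse:.4f}")
```

Because only the readout is fitted, training reduces to solving one linear system, and each new estimate costs a single reservoir update plus a dot product. That low per-step cost is one reason this family of models is a plausible fit for the real-time setting the abstract describes.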