Perceptually-Driven Scalable MDCT Enhancement of Compressed Audio Based on Statistical Conversion
D. Cantzos, A. Mouchtaris, C. Kyriakakis
2011 IEEE International Symposium on Multimedia, 2011-12-05
DOI: 10.1109/ISM.2011.16 (https://doi.org/10.1109/ISM.2011.16)
Citations: 1
Abstract
Many state-of-the-art audio codecs operating in a transform domain provide scalability as a core function by selectively removing bits from the full bit rate data stream, usually according to a nonperceptual criterion. This work presents a different, even reverse, approach to scalability in which a scalable codec selectively adds perceptually significant bits to a low bit rate data stream. The scalable enhancement algorithm presented here operates in the Modified Discrete Cosine Transform (MDCT) domain, which is popular among perceptual audio transform encoders, but its extension to other domains is straightforward. By exploiting the information in an existing low bit rate base layer, the algorithm adds perceptually significant data to the data stream according to a psychoacoustic model and improves the audio quality at a fraction of the bit rate that would normally be required to encode or transmit the whole audio piece at the same quality. Applications can be found in packet retransmission schemes for compressed audio networks and in remote audio enhancement.
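The idea of adding only perceptually significant data on top of a base layer can be illustrated with a minimal sketch. This is not the authors' algorithm (the abstract does not give its details, and the statistical conversion step is not modelled here). It assumes that MDCT coefficients of the original and of the decoded base layer are already available, and that a psychoacoustic model has supplied a per-bin masking threshold. The function names, the `mask` values, and the `budget` parameter are all hypothetical.

```python
def select_enhancement(reference, base, mask, budget):
    """Pick up to `budget` bins whose base-layer error is assumed most audible.

    reference, base: MDCT coefficients of the original and the decoded base layer.
    mask: hypothetical per-bin masking thresholds (amplitudes, all > 0).
    Returns {bin index: correction} forming the enhancement layer.
    """
    # Error-to-mask ratio per bin; a ratio above 1 is treated as audible.
    ratios = [
        (abs(r - b) / m, k)
        for k, (r, b, m) in enumerate(zip(reference, base, mask))
    ]
    audible = sorted((x for x in ratios if x[0] > 1.0), reverse=True)
    return {k: reference[k] - base[k] for _, k in audible[:budget]}


def apply_enhancement(base, layer):
    """Add the enhancement corrections to the base-layer coefficients."""
    return [b + layer.get(k, 0.0) for k, b in enumerate(base)]


# Toy data (made up for illustration only).
reference = [4.0, -2.0, 1.0, 0.5]
base = [3.0, -2.0, 0.0, 0.4]
mask = [0.5, 0.5, 2.0, 0.5]

layer = select_enhancement(reference, base, mask, budget=2)
enhanced = apply_enhancement(base, layer)
```

In this toy case only bin 0 is sent: bin 2 has the same absolute error, but its error lies below the assumed masking threshold, so no bits are spent on it.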