Multi-channel audio signal retrieval based on multi-factor data mining with tensor decomposition

2014 19th International Conference on Digital Signal Processing
Pub Date: 2014-09-18
DOI: 10.1109/ICDSP.2014.6900766

Yi Zhao, Jing Wang, Lidong Yang, Ali Imtiaz, Jingming Kuang



Abstract

This paper proposes an efficient method for retrieving multi-channel audio from a limited number of channels, based on multi-factor data mining with tensor decomposition. We briefly discuss how to convert the limited channels to a larger number of channels (multi-channel) by capturing the latent higher-order tensor structure of multi-channel audio data. The multi-channel audio data space is built on three factors: location, channel and time-frequency. CANDECOMP/PARAFAC (CP) decomposition is introduced into the multi-factor data mining process to predict the data in the missing channels. In addition, considering human auditory effects at low frequencies, we compute a set of data in advance for retrieving the Low Frequency Effects (LFE) channel. The performance of the proposed method is assessed with the MUlti-Stimulus test with Hidden Reference and Anchor (MUSHRA) listening test. We further demonstrate the retrieval of 5.1 multi-channel audio from stereo audio. Experiments show that acceptable conversion quality is achieved and that the novel tensor-based method is easier to implement than the traditional method.
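The abstract gives no algorithmic detail, so the following is only a minimal sketch of how CP decomposition can predict missing entries of a three-way tensor, not the authors' implementation. It assumes a tensor indexed by location × channel × time-frequency, a plain alternating-least-squares (ALS) fit, and a simple fill-in step that replaces unobserved entries with the current model estimate. The function names (`khatri_rao`, `unfold`, `reconstruct`, `cp_complete`), the rank, the tensor sizes and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def reconstruct(factors):
    """Rebuild a 3-way tensor from its CP factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def cp_complete(tensor, mask, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to the entries where `mask` is True.

    Assumption: missing entries are handled by imputing them with the
    current model estimate after every ALS sweep.
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in tensor.shape]
    filled = np.where(mask, tensor, 0.0)
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            # Gram matrix of the Khatri-Rao product, computed cheaply.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(filled, mode) @ kr @ np.linalg.pinv(gram)
        # Keep observed entries, overwrite missing ones with the estimate.
        filled = np.where(mask, tensor, reconstruct(factors))
    return factors


# Synthetic demo (hypothetical sizes): 4 locations, 6 channels, 30 bins.
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((n, 2)) for n in (4, 6, 30)]
full = reconstruct(true_factors)

# At location 0 only the first two channels (a "stereo" pair) are observed.
mask = np.ones(full.shape, dtype=bool)
mask[0, 2:, :] = False

factors = cp_complete(full, mask, rank=2)
estimate = reconstruct(factors)
rel_err = np.linalg.norm((estimate - full)[~mask]) / np.linalg.norm(full[~mask])
print(f"relative error on the missing channels: {rel_err:.3f}")
```

The demo masks four of six channels at a single location while leaving the other locations fully observed, which loosely mirrors the stereo-to-5.1 scenario. If a channel were missing at every location, its factor row could not be estimated from the data at all, which may be one reason the abstract mentions precomputing data for the LFE channel.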


