Robust sound event classification by using denoising autoencoder

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP). Pub Date: 2016-09-01. DOI: 10.1109/MMSP.2016.7813376

Jianchao Zhou, Liqun Peng, Xiaoou Chen, Deshun Yang

Citations: 5

Abstract

Sound event classification has been studied extensively over the last decade, but its performance degrades sharply in the presence of noise. Because spectrogram-based image features and denoising autoencoders have both reportedly shown superior performance in noisy conditions, this paper proposes a new robust feature for sound event classification, the denoising autoencoder image feature (DIF): an image feature extracted from an image-like representation produced by a denoising autoencoder. The feature is evaluated in a classification experiment using an SVM classifier on audio examples with different noise levels, and compared with baseline features including mel-frequency cepstral coefficients (MFCC) and the spectrogram image feature. The proposed DIF demonstrates better performance than these baselines under noise-corrupted conditions.
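The evaluation relies on audio examples at different noise levels. The abstract does not say how those examples were constructed, so the following is only a minimal sketch of one common approach: scaling a noise signal so that the mixture reaches a chosen signal-to-noise ratio (SNR). The helper names (`mean_power`, `mix_at_snr`, `measured_snr_db`), the synthetic 440 Hz tone, the Gaussian noise, and the SNR values are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared samples) of a signal."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR in dB.

    Hypothetical helper: the abstract does not describe how noisy
    examples were actually produced.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(p_clean / (gain^2 * p_noise)); solve for gain.
    gain = math.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a mixture, given the clean reference signal."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(residual))


# Assumed toy data: one second of a 440 Hz tone at 16 kHz plus Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# One noisy copy of the clip per assumed noise level (in dB).
noisy_examples = {snr: mix_at_snr(clean, noise, snr) for snr in (20, 10, 0)}
```

In a setup like this, each entry of `noisy_examples` would then be passed through feature extraction (MFCC, a spectrogram image feature, or DIF) before classification, so that the features can be compared at each noise level.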


