Robust sound event classification by using denoising autoencoder

2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP). Pub Date: 2016-09-01. DOI: 10.1109/MMSP.2016.7813376

Jianchao Zhou, Liqun Peng, Xiaoou Chen, Deshun Yang

Citations: 5

Abstract

Over the last decade, sound event classification has been studied extensively. A main problem, however, is that classification performance degrades sharply in the presence of noise. Because spectrogram-based image features and denoising autoencoders have both been reported to perform well in noisy conditions, this paper proposes a new robust feature for sound event classification, the denoising autoencoder image feature (DIF): an image feature extracted from an image-like representation produced by a denoising autoencoder. The feature is evaluated in a classification experiment using an SVM classifier on audio examples with different noise levels, and compared with baseline features including mel-frequency cepstral coefficients (MFCC) and the spectrogram image feature. The proposed DIF demonstrates better performance under noise-corrupted conditions.
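The abstract does not give the network architecture, the training data, or the exact image feature, so the following is only a minimal NumPy sketch of the general idea: train a denoising autoencoder to map noisy spectrogram frames to clean ones, stack its outputs into an image-like array, and summarize that array with block statistics. Everything specific here is an assumption made for illustration: the synthetic frames from `make_frames`, the one-hidden-layer network in `train_dae`, the hyperparameters, and the `block_stats` feature (a hypothetical stand-in, not the paper's DIF definition). The SVM classification stage is not shown.

```python
import numpy as np


def make_frames(n_frames=200, n_bins=32, seed=0):
    """Synthetic stand-in for clean spectrogram frames (not the paper's data).

    Each frame is a Gaussian bump centred on one of two frequency bins,
    with a random amplitude, mimicking two toy sound event classes.
    """
    rng = np.random.default_rng(seed)
    bins = np.arange(n_bins)
    labels = rng.integers(0, 2, n_frames)
    centres = np.where(labels == 0, n_bins // 4, 3 * n_bins // 4)
    amps = rng.uniform(0.5, 1.0, n_frames)
    frames = amps[:, None] * np.exp(-((bins[None, :] - centres[:, None]) ** 2) / 18.0)
    return frames, labels


def train_dae(clean, noise_std=0.3, hidden=16, lr=0.02, epochs=500, seed=1):
    """Train a one-hidden-layer denoising autoencoder by full-batch gradient descent.

    The input is corrupted with fresh Gaussian noise every epoch, while the
    reconstruction target is always the clean frame.
    """
    rng = np.random.default_rng(seed)
    n, d = clean.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        noisy = clean + rng.normal(0.0, noise_std, clean.shape)
        h = np.tanh(noisy @ W1 + b1)        # encoder
        out = h @ W2 + b2                   # linear decoder
        err = out - clean
        # Loss: half the squared error, summed over bins, averaged over frames.
        losses.append(0.5 * float(np.sum(err ** 2)) / n)
        g_out = err / n                     # gradient of the loss w.r.t. out
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (noisy.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (W1, b1, W2, b2), losses


def denoise(params, frames):
    """Run frames through the trained autoencoder."""
    W1, b1, W2, b2 = params
    return np.tanh(frames @ W1 + b1) @ W2 + b2


def block_stats(image, block=8):
    """Hypothetical image feature: mean and std of each block x block tile."""
    rows = image.shape[0] // block * block
    cols = image.shape[1] // block * block
    tiles = image[:rows, :cols].reshape(rows // block, block, cols // block, block)
    means = tiles.mean(axis=(1, 3)).ravel()
    stds = tiles.std(axis=(1, 3)).ravel()
    return np.concatenate([means, stds])


frames, labels = make_frames()
params, losses = train_dae(frames)

# Treat the first 64 frames as one noisy "clip" and build its feature vector.
rng = np.random.default_rng(2)
noisy_clip = frames[:64] + rng.normal(0.0, 0.3, frames[:64].shape)
denoised_clip = denoise(params, noisy_clip)   # image-like representation
feature = block_stats(denoised_clip)          # one vector per clip
print(losses[0], losses[-1], feature.shape)
```

In a full pipeline, one such feature vector per audio example would be passed to a classifier such as an SVM, and the same experiment repeated at each noise level to compare against MFCC and spectrogram image feature baselines.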
