Feature extraction with convolutional restricted Boltzmann machine for audio classification

2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR). Pub Date: 2015-11-01. DOI: 10.1109/ACPR.2015.7486611

Min Li, Z. Miao, Cong Ma


Citations: 7

Abstract


Feature extraction is a crucial part of many audio tasks. Researchers have extracted audio features in multiple ways, and some of the most recent methods are based on the hidden layer of a trained neural network. In this paper, we present a system that automatically extracts features from unlabeled audio data; the extracted features are then used for an audio classification task. Our feature extraction scheme uses a convolutional restricted Boltzmann machine (CRBM) rather than a restricted Boltzmann machine (RBM). With features extracted from the CRBM, we consistently achieve an accuracy improvement of about 7% over RBM-based features on the TI-Digits dataset for audio classification. We also combine the well-known MFCC features and the CRBM-based features in the form of a linear combination. In our experiments, the combined feature performs better than either feature alone.
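To make the idea of CRBM-based features concrete, here is a minimal NumPy sketch of how the hidden-unit activations of a convolutional RBM could be computed from a spectrogram and pooled into a fixed-length feature vector. It is not the authors' implementation. The function names, the filter shape, the choice to convolve along the time axis only, and the max-over-time pooling are all assumptions made for illustration, and the filters here are random rather than trained.

```python
import numpy as np


def crbm_hidden_probs(spec, filters, hidden_bias):
    """Hidden-unit activation probabilities of a 1-D convolutional RBM.

    Assumed (hypothetical) shapes:
      spec:        (n_frames, n_bins)          spectrogram-like input
      filters:     (n_filters, width, n_bins)  one weight patch per hidden group
      hidden_bias: (n_filters,)                one shared bias per hidden group
    Returns an array of shape (n_filters, n_frames - width + 1).
    """
    n_filters, width, _ = filters.shape
    n_out = spec.shape[0] - width + 1
    # Sliding windows over time: (n_out, width, n_bins).
    windows = np.stack([spec[t:t + width] for t in range(n_out)])
    # Cross-correlate every window with every filter (weights shared over time).
    pre_activation = np.einsum("twb,kwb->kt", windows, filters)
    pre_activation = pre_activation + hidden_bias[:, None]
    # Sigmoid gives P(h = 1 | v) for binary hidden units.
    return 1.0 / (1.0 + np.exp(-pre_activation))


def pooled_features(hidden_probs):
    """Assumed pooling step: max over time gives one value per filter."""
    return hidden_probs.max(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec = rng.standard_normal((20, 8))                # 20 frames, 8 frequency bins
    filters = 0.1 * rng.standard_normal((4, 5, 8))     # 4 untrained filters, width 5
    probs = crbm_hidden_probs(spec, filters, np.zeros(4))
    print(probs.shape, pooled_features(probs).shape)
```

In a real system the filters and biases would first be learned from unlabeled audio (for example with contrastive divergence), and the pooled vector would then be passed to a classifier.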
