Domain Adaptation Neural Network for Acoustic Scene Classification in Mismatched Conditions

Rui Wang, Mou Wang, Xiao-Lei Zhang, S. Rahardja
Published in: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2019-11-01
DOI: 10.1109/APSIPAASC47483.2019.9023057 (https://doi.org/10.1109/APSIPAASC47483.2019.9023057)
Citations: 7

Abstract

Acoustic scene classification (ASC) is the task of predicting the acoustic environment of an audio recording. Because the training and test conditions in most real-world ASC problems do not match, domain adaptation methods are needed to solve the cross-domain problem. In this paper, we propose an ASC method based on a domain adaptation neural network (DANN). Specifically, we first extract an acoustic feature, the log-Mel spectrogram, which has proven effective in previous studies. Then, we train a DANN to project the training and test domains into one common space where the acoustic scenes are categorized jointly. To boost the overall performance of the proposed method, we further train an ensemble of convolutional neural network (CNN) models, each with different parameter settings. Finally, we fuse the DANN and CNN models by averaging their outputs. We evaluated the proposed method on subtask B of task 1 of the DCASE 2019 ASC challenge, a closed-set classification problem whose audio was recorded with mismatched devices. Experimental results demonstrate the effectiveness of the proposed method for acoustic scene classification in mismatched conditions.
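
The final fusion step, averaging the outputs of the DANN and the CNN models, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `average_fusion` and `predict`, the three-class toy scores, and the assumption that each model emits per-class probabilities for every clip are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def average_fusion(
    model_outputs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average per-class scores across models (hypothetical helper).

    model_outputs[m][i][c] is model m's score for class c on clip i.
    All models are assumed to score the same clips and classes.
    """
    if not model_outputs:
        raise ValueError("need at least one model")
    n_models = len(model_outputs)
    n_clips = len(model_outputs[0])
    fused = []
    for i in range(n_clips):
        n_classes = len(model_outputs[0][i])
        fused.append([
            sum(m[i][c] for m in model_outputs) / n_models
            for c in range(n_classes)
        ])
    return fused


def predict(fused: Sequence[Sequence[float]]) -> List[int]:
    """Pick the class index with the highest fused score for each clip."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Toy scores (made up): 2 clips, 3 scene classes, one DANN and two CNNs.
dann = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
cnn_a = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]
cnn_b = [[0.5, 0.4, 0.1], [0.3, 0.3, 0.4]]

fused = average_fusion([dann, cnn_a, cnn_b])
labels = predict(fused)
print(labels)  # [1, 2]
```

In this toy case the fused decision differs from the DANN's own top choice on both clips, which shows how averaging lets the ensemble override a single model. Whether the paper averages probabilities or pre-softmax scores is not stated in the abstract, so the choice of probabilities here is an assumption.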