Sound event detection using class activation maps

Jakub Bajzik, R. Jarina
Published in: 2022 ELEKTRO (ELEKTRO), 2022-05-23
DOI: https://doi.org/10.1109/ELEKTRO53996.2022.9803350
Citations: 0

Abstract

In this paper, we present a system for sound event detection in domestic environments, as defined in Task 4 of the DCASE 2021 challenge. The task aims to provide timestamps that localize audio events in addition to event class probabilities. We explore the use of class activation maps, known from image processing, in such sound event detection systems. We propose two systems. The first is a convolutional neural network trained for sound event classification using only weakly labeled and unlabeled data; the strong labels are obtained using class activation mapping, a technique that is popular especially in image processing. In the second system, we modified the baseline system provided by the DCASE organizers by adding class activation mapping as part of the attention mechanism. The experimental results show that class activation maps improve system performance in comparison with the baseline.
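The idea of deriving strong (time-stamped) labels from a classifier trained on weak labels can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard class activation mapping setup, in which the last convolutional layer is followed by global average pooling and a single linear layer. The function names, the array shapes, the averaging over frequency, the min-max normalization, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import numpy as np


def class_activation_map(features, weights):
    """Compute class activation maps.

    features: (C, F, T) feature maps of the last convolutional layer
              (channels, frequency bins, time frames). Assumed layout.
    weights:  (K, C) weights of the linear classifier that follows
              global average pooling (K classes). Assumed layout.
    Returns an array of shape (K, F, T): one map per class.
    """
    # Each class map is the channel-wise weighted sum of the feature maps.
    return np.einsum("kc,cft->kft", weights, features)


def cam_to_events(cam, frame_seconds, threshold=0.5):
    """Turn one class's map (F, T) into (onset, offset) pairs in seconds.

    The map is averaged over frequency, min-max normalized over time,
    and thresholded. The threshold value is a hypothetical choice.
    """
    curve = cam.mean(axis=0)
    span = curve.max() - curve.min()
    if span == 0:
        return []  # flat activation: nothing to localize
    curve = (curve - curve.min()) / span
    active = curve >= threshold

    # Pad with False so that events touching the clip borders get edges too.
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    onsets, offsets = changes[0::2], changes[1::2]  # offsets are exclusive
    return [
        (float(a * frame_seconds), float(b * frame_seconds))
        for a, b in zip(onsets, offsets)
    ]
```

Calling `class_activation_map` once per clip and then `cam_to_events` once per class yields candidate onset and offset times from a model that only ever saw clip-level labels. A real system would typically add smoothing such as median filtering and per-class thresholds, which this sketch omits.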