Affinity Mixup for Weakly Supervised Sound Event Detection

2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP). Pub Date: 2021-06-21. DOI: 10.1109/mlsp52302.2021.9596270

M. Izadi, R. Stevenson, L. Kloepper

Citations: 1

Abstract

Weakly supervised sound event detection (WSSED) is the task of predicting the presence of sound events and their corresponding start and end points using a weakly labeled dataset. A weakly labeled dataset associates each training sample (a short recording) with one or more sources present in it. Networks that rely solely on convolutional and recurrent layers cannot directly relate multiple frames in a recording. Motivated by attention and graph neural networks, we introduce the concept of affinity mixup (AM) to incorporate time-level similarities and connect frames to one another. This regularization technique mixes up features in different layers using an adaptive affinity matrix. Our proposed affinity mixup network (AMN) improves event-F1 scores over state-of-the-art techniques by 8.2%.
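The abstract does not say how the affinity matrix is computed or how features are mixed, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the affinity between two frames is a scaled dot product normalized by a row-wise softmax (as in attention), and that each frame is blended with its affinity-weighted context using a fixed weight `lam`. The function names, the similarity measure, and `lam` are all hypothetical; in the paper the matrix is described as adaptive, which may mean it is learned.

```python
import math


def affinity_matrix(frames):
    """Frame-to-frame affinities (ASSUMPTION: scaled dot product + row softmax)."""
    dim = len(frames[0])
    rows = []
    for f in frames:
        scores = [sum(a * b for a, b in zip(f, g)) / math.sqrt(dim) for g in frames]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


def affinity_mixup(frames, lam=0.5):
    """Blend each frame with the affinity-weighted average of all frames.

    ASSUMPTION: mixed = (1 - lam) * X + lam * (A @ X), with lam in [0, 1].
    """
    affinity = affinity_matrix(frames)
    n_frames, dim = len(frames), len(frames[0])
    mixed = []
    for t in range(n_frames):
        context = [
            sum(affinity[t][s] * frames[s][d] for s in range(n_frames))
            for d in range(dim)
        ]
        mixed.append([(1 - lam) * x + lam * c for x, c in zip(frames[t], context)])
    return mixed


# Three frames with two features each; frames 0 and 1 are similar.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mixed = affinity_mixup(frames, lam=0.5)
```

Because each row of the affinity matrix sums to one, every mixed frame is a convex combination of the input frames, so similar frames are pulled toward each other while the feature shape is unchanged.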
