Identifying Sound Source Node Locations Using Neural Networks Trained with Phasograms

A. Copiaco, C. Ritz, Stefano Fasciani, N. Abdulaziz
Venue: 2020 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)
DOI: 10.1109/ISSPIT51521.2020.9408643
URL: https://doi.org/10.1109/ISSPIT51521.2020.9408643
Published: 2020-12-09
Citations: 1

Abstract

This work examines the approximation of sound source location using neural networks. The majority of related work either omits the phase information from the Short-Time Fourier Transform (STFT) or uses it solely to restore irregularities in spectrograms. Our approach differs in that it focuses on the phase component of the STFT coefficients, estimating the sound source location by classifying the closest microphone array (node). Mapping the phase-difference information in the time-frequency domain produces an image that we call a phasogram, which is used as the input to the neural network. Experiments use recordings from the first four nodes of the SINS database. Two variants were examined: phase differences across adjacent microphones, and phase differences against the first microphone. Under five-fold cross-validation, the adjacent-microphone variant achieved an F1-score of 99.68% and the first-microphone variant 99.31%. A real-world application of our work is healthcare monitoring systems, when integrated with a sound scene classification system.
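The abstract does not give the exact phasogram construction, so the following is only a minimal sketch of one plausible reading: take the STFT of each microphone channel and map the wrapped phase difference between channel pairs over time and frequency. The function names (`stft`, `phasograms`), the Hann window, the frame size, the hop length, and the four-channel test signal are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np


def stft(x, n_fft=256, hop=128):
    """Hann-windowed STFT of a 1-D signal.

    Returns a complex array of shape (n_fft // 2 + 1, n_frames).
    Window type and sizes are illustrative assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def phasograms(signals, mode="adjacent", n_fft=256, hop=128):
    """Phase-difference maps for a multi-channel recording.

    signals: array of shape (n_mics, n_samples).
    mode="adjacent": microphone k+1 against microphone k.
    mode="first":    every other microphone against microphone 0.
    Returns an array of shape (n_mics - 1, n_freq, n_frames) with
    values wrapped to [-pi, pi].
    """
    spectra = np.stack([stft(ch, n_fft, hop) for ch in signals])
    if mode == "adjacent":
        reference = spectra[:-1]
    elif mode == "first":
        reference = spectra[:1]  # broadcasts against the remaining channels
    else:
        raise ValueError("mode must be 'adjacent' or 'first'")
    target = spectra[1:]
    # The angle of the cross-spectrum is the wrapped phase difference.
    return np.angle(target * np.conj(reference))


# Hypothetical four-channel recording (random noise stands in for audio).
rng = np.random.default_rng(0)
recording = rng.standard_normal((4, 2048))
adjacent = phasograms(recording, mode="adjacent")
against_first = phasograms(recording, mode="first")
```

Each resulting map could then be treated as an image channel for a classifier whose labels are the node indices. Taking the angle of the cross-spectrum, rather than subtracting two separate phase angles, keeps the difference wrapped without an extra modulo step.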