Environmental sound recognition using time-frequency intersection patterns

Xuan Guo, Y. Toyoda, Huan Li, Jie Huang, Shuxue Ding, Yong Liu
Published in: 2011 3rd International Conference on Awareness Science and Technology (iCAST)
Publication date: 2011-09-01
DOI: 10.1155/2012/650818 (https://doi.org/10.1155/2012/650818)
Citations: 14

Abstract

Environmental sound recognition is an important capability for robots and intelligent computer systems. In this research, we apply a multi-stage perceptron-type neural network system to environmental sound recognition. The input is a one-dimensional vector that combines the instantaneous spectrum at the power peak with the power pattern in the time domain. Because the spectra of most environmental sounds change little over time compared with speech, this combination of power and frequency patterns preserves the major features of environmental sounds while drastically reducing the amount of data. Two experiments were conducted, one using an original database and one using a database created by the RWCP. The recognition rate for about 45 kinds of environmental sound was about 92%. The merit of this method is its one-dimensional input, which combines the power pattern and the instantaneous spectrum of the sound data. Compared with a method that uses only the instantaneous spectrum, the new method is adequate for a larger sound database and raises the recognition rate by about 12%. The results are also comparable with those of HMM-based methods, which require two-dimensional spectrum time-series data and more complicated computation.
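
The one-dimensional input described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, the non-overlapping framing, the peak normalization, and all function names (`frame_powers`, `magnitude_spectrum`, `build_feature_vector`) are assumptions chosen for illustration, since the abstract does not specify them. The sketch locates the frame with the highest power, takes the magnitude spectrum of that frame, and concatenates it with the frame-by-frame power pattern.

```python
import cmath
import math


def frame_powers(signal, frame_len):
    """Mean power of each non-overlapping frame (the time-domain power pattern)."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def magnitude_spectrum(frame):
    """Magnitude of a naive DFT for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def peak_normalize(values):
    """Scale so the largest value is 1; all-zero input is returned unchanged."""
    top = max(values)
    return [v / top for v in values] if top > 0 else list(values)


def build_feature_vector(signal, frame_len=64):
    """Concatenate the spectrum at the power peak with the power pattern.

    frame_len and the normalization are illustrative assumptions,
    not values taken from the paper.
    """
    powers = frame_powers(signal, frame_len)
    peak = max(range(len(powers)), key=powers.__getitem__)
    peak_frame = signal[peak * frame_len:(peak + 1) * frame_len]
    spectrum = magnitude_spectrum(peak_frame)
    return peak_normalize(spectrum) + peak_normalize(powers)


# Synthetic test sound: a 1 kHz tone with a decaying envelope, sampled at 8 kHz.
rate = 8000
signal = [
    math.exp(-t / 400) * math.sin(2 * math.pi * 1000 * t / rate)
    for t in range(1024)
]
features = build_feature_vector(signal)
print(len(features))  # 33 spectrum bins + 16 power frames = 49
```

With these assumed settings, a 1024-sample sound is reduced to 49 values, whereas a full spectrogram over the same 16 frames would hold 16 × 33 values. This is the kind of data reduction the abstract attributes to the one-dimensional input. A production version would more likely use an FFT routine than the naive DFT shown here.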