An N-gram model for unstructured audio signals toward information retrieval

Samuel Kim, Shiva Sundaram, P. Georgiou, Shrikanth S. Narayanan
{"title":"An N-gram model for unstructured audio signals toward information retrieval","authors":"Samuel Kim, Shiva Sundaram, P. Georgiou, Shrikanth S. Narayanan","doi":"10.1109/MMSP.2010.5662068","DOIUrl":null,"url":null,"abstract":"An N-gram modeling approach for unstructured audio signals is introduced with applications to audio information retrieval. The proposed N-gram approach aims to capture local dynamic information in acoustic words within the acoustic topic model framework which assumes an audio signal consists of latent acoustic topics and each topic can be interpreted as a distribution over acoustic words. Experimental results on classifying audio clips from BBC Sound Effects Library according to both semantic and onomatopoeic labels indicate that the proposed N-gram approach performs better than using only a bag-of-words approach by providing complementary local dynamic information.","PeriodicalId":105774,"journal":{"name":"2010 IEEE International Workshop on Multimedia Signal Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Workshop on Multimedia Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MMSP.2010.5662068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

An N-gram modeling approach for unstructured audio signals is introduced with applications to audio information retrieval. The proposed N-gram approach aims to capture local dynamic information in acoustic words within the acoustic topic model framework which assumes an audio signal consists of latent acoustic topics and each topic can be interpreted as a distribution over acoustic words. Experimental results on classifying audio clips from BBC Sound Effects Library according to both semantic and onomatopoeic labels indicate that the proposed N-gram approach performs better than using only a bag-of-words approach by providing complementary local dynamic information.
面向信息检索的非结构化音频信号n图模型
介绍了一种非结构化音频信号的n图建模方法,并将其应用于音频信息检索。提出的n图方法旨在在声学主题模型框架内捕获声学单词中的局部动态信息,该模型假设音频信号由潜在的声学主题组成,并且每个主题可以被解释为声学单词的分布。根据语义和拟声标签对BBC Sound Effects Library中的音频片段进行分类的实验结果表明,通过提供互补的局部动态信息,所提出的N-gram方法比仅使用词袋方法表现更好。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信