Robust voice activity detection for social sensing

S. Feese, G. Tröster
Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication, published 2013-09-08. DOI: https://doi.org/10.1145/2494091.2497347
Citations: 2

Abstract

Speech is a rich source of personal information, and speech detection is therefore a fundamental function of many social sensing applications. Even the amount of speech present in our surroundings can indicate our sociability and communication patterns. In this work, we present and evaluate a speech detection approach based on dictionary learning and sparse signal representation. By transforming noisy audio data into a sparse representation with a dictionary learned from clean speech data, we show that speech and non-speech can be discriminated with up to 92% accuracy, even in low signal-to-noise conditions. In addition to an evaluation with simulated data, we evaluate the algorithm on a real-world data set recorded during firefighting missions. We show that the speech activity of firefighters can be detected with 85% accuracy using a smartphone placed in the firefighting jacket.
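The idea of coding noisy frames against a speech dictionary can be illustrated with a small sketch. It is not the authors' implementation: the abstract names neither the sparse solver nor the decision rule, so the greedy matching pursuit, the explained-energy score, the 0.5 threshold, and all names (`ATOMS`, `matching_pursuit`, `explained_energy`, `is_speech`) are assumptions made for illustration. The dictionary here is a hand-built set of cosine atoms standing in for one learned from clean speech, and the dictionary learning step itself is omitted.

```python
import math
import random

N = 64  # samples per frame (hypothetical value)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Stand-in dictionary: four unit-norm cosine atoms. In the paper the
# dictionary is learned from clean speech; these atoms are an assumption.
ATOMS = [
    normalize([math.cos(2 * math.pi * k * n / N) for n in range(N)])
    for k in range(1, 5)
]


def matching_pursuit(frame, atoms, n_nonzero):
    """Greedy sparse coding: return (coefficients, residual)."""
    residual = list(frame)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        # Remove that atom's contribution from the residual.
        residual = [r - scores[k] * x for r, x in zip(residual, atoms[k])]
    return coeffs, residual


def explained_energy(frame, atoms=ATOMS, n_nonzero=2):
    """Fraction of frame energy captured by the sparse approximation."""
    _, residual = matching_pursuit(frame, atoms, n_nonzero)
    total = dot(frame, frame)
    if total == 0.0:
        return 0.0
    return 1.0 - dot(residual, residual) / total


def is_speech(frame, threshold=0.5):
    """Hypothetical decision rule: speech if the dictionary explains enough."""
    return explained_energy(frame) > threshold


rng = random.Random(0)
noise = normalize([rng.gauss(0.0, 1.0) for _ in range(N)])
clean = [a + 0.5 * b for a, b in zip(ATOMS[0], ATOMS[2])]  # speech-like toy frame
noisy = [s + 0.5 * w for s, w in zip(clean, noise)]        # same frame plus noise

for name, frame in [("clean", clean), ("noisy", noisy), ("noise only", noise)]:
    print(name, round(explained_energy(frame), 3), is_speech(frame))
```

The toy frames show the intended contrast: a frame built from dictionary atoms is still well explained after noise is added, while a noise-only frame is not. A real detector would use a learned dictionary and a decision rule tuned on labelled audio.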