A tissue-conductive acoustic sensor applied in speech recognition for privacy

P. Heracleous, Y. Nakajima, H. Saruwatari, K. Shikano
Published in: sOc-EUSAI '05, 2005-10-12
DOI: 10.1145/1107548.1107577 (https://doi.org/10.1145/1107548.1107577)
Citations: 17

Abstract

In this paper, we present Non-Audible Murmur (NAM) microphones, focusing on their applications in automatic speech recognition. A NAM microphone is a special acoustic sensor that is attached behind the talker's ear and captures very quietly uttered speech (non-audible murmur) through body tissue. Previously, we reported experimental results for non-audible murmur recognition using a Stethoscope microphone in a clean environment. Here we also present a more advanced NAM microphone, the so-called Silicon NAM microphone. Using a small amount of training data and adaptation approaches, we achieved 93.9% word accuracy on a 20k-vocabulary dictation task. Therefore, in situations where privacy in human-machine communication is preferable, NAM microphones can be applied effectively to the automatic recognition of speech that is inaudible to listeners near the talker. Because of the nature of non-audible murmur (e.g., privacy), investigating the behavior of NAM microphones in noisy environments is highly important. To this end, we conducted experiments in real and simulated noisy environments. Although NAM microphones show high robustness against noise on simulated noisy data, recognition performance in real environments decreases markedly due to the Lombard reflex. We report experimental results showing the negative impact of the Lombard reflex on non-audible murmur recognition. In addition to the dictation task, we also report a keyword-spotting system based on non-audible murmur, with very promising results.
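The abstract does not say how word accuracy is scored. As a minimal sketch, assuming the conventional definition (N - S - D - I) / N, where N is the number of reference words and S, D, and I are substitutions, deletions, and insertions, the figure can be computed as below. The function name `word_accuracy` and the unit edit costs are illustrative assumptions, not the authors' scoring tool, and real scoring tools may align words with different costs.

```python
def word_accuracy(reference, hypothesis):
    """Return (N - S - D - I) / N for two sequences of words.

    Assumption: S + D + I is taken from a minimum edit-distance
    alignment with unit costs.
    """
    ref, hyp = list(reference), list(hypothesis)
    n, m = len(ref), len(hyp)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)

    errors = dist[n][m]  # S + D + I under the alignment above
    return (n - errors) / n


if __name__ == "__main__":
    ref = "please open the front door".split()
    hyp = "please open front door now".split()
    print(word_accuracy(ref, hyp))
```

Because insertions are counted as errors, this measure can fall below zero when the recognizer outputs many spurious words. That is why it is usually reported alongside, not instead of, the percentage of correctly recognized words.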