An open source software system for robot audition HARK and its evaluation

K. Nakadai, Hiroshi G. Okuno, H. Nakajima, Yuji Hasegawa, H. Tsujino
Published in: Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots
Publication date: 2008-12-01
DOI: 10.1109/ICHR.2008.4756031 (https://doi.org/10.1109/ICHR.2008.4756031)
Citations: 72

Abstract

The ability of a robot to listen to several things at once with its own ears, that is, robot audition, is important for improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments, with enough flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software called "HARK", which includes sound source localization, separation, and automatic speech recognition (ASR). Because separated sounds suffer from spectral distortion caused by the separation itself, HARK generates a time-frequency map of reliability, called a "missing feature mask", for the features of separated sounds. The separated sounds are then recognized by ASR based on Missing-Feature Theory (MFT) using these missing feature masks. HARK is implemented on middleware called "FlowDesigner" to share intermediate audio data, which enables real-time processing. HARK's performance in recognizing noisy and simultaneous speech is demonstrated on three humanoid robots, Honda ASIMO, SIG2, and Robovie, each with a different microphone layout.
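The missing-feature idea in the abstract can be illustrated with a minimal sketch. This is not HARK's implementation: the function names, the energy-ratio reliability measure, the 0.5 threshold, and the toy numbers are all assumptions made for illustration. The sketch builds a hard 0/1 mask over time-frequency bins and then scores a feature frame against a diagonal Gaussian while skipping the bins marked unreliable, which is one common way to apply Missing-Feature Theory.

```python
import math


def missing_feature_mask(signal_energy, noise_energy, threshold=0.5):
    """Return a hard 0/1 mask per time-frequency bin (hypothetical helper).

    A bin is marked reliable (1) when the estimated signal share of the
    total energy in that bin is strictly greater than `threshold`.
    """
    mask = []
    for sig_frame, noise_frame in zip(signal_energy, noise_energy):
        row = []
        for s, n in zip(sig_frame, noise_frame):
            total = s + n
            reliability = s / total if total > 0 else 0.0
            row.append(1 if reliability > threshold else 0)
        mask.append(row)
    return mask


def masked_log_likelihood(frame, mask_row, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Features whose mask value is 0 contribute nothing to the score.
    """
    total = 0.0
    for x, m, mu, var in zip(frame, mask_row, means, variances):
        if m:
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# Toy data (assumed): 2 frames x 3 frequency bins of estimated energies.
signal = [[4.0, 1.0, 0.2], [3.0, 0.5, 2.0]]
noise = [[1.0, 1.0, 0.8], [0.5, 2.0, 0.5]]

mask = missing_feature_mask(signal, noise)
print(mask)
print(masked_log_likelihood(signal[0], mask[0], [4.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
```

A hard mask keeps the example short; a soft mask with values between 0 and 1 that weight each feature's contribution would be a natural variation under the same assumptions.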