An open source software system for robot audition HARK and its evaluation

Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots. Pub Date: 2008-12-01. DOI: 10.1109/ICHR.2008.4756031

K. Nakadai, Hiroshi G. Okuno, H. Nakajima, Yuji Hasegawa, H. Tsujino


Citations: 72

Abstract



Robot capability of listening to several things at once by its own ears, that is, robot audition, is important in improving human-robot interaction. The critical issue in robot audition is real-time processing in noisy environments with high flexibility to support various kinds of robots and hardware configurations. This paper presents open-source robot audition software, called "HARK", which includes sound source localization, separation, and automatic speech recognition (ASR). Since separated sounds suffer from spectral distortion due to separation, HARK generates a temporal-frequency map of reliability, called "missing feature mask", for features of separated sounds. Then separated sounds are recognized by the Missing-Feature Theory (MFT) based ASR with missing feature masks. HARK is implemented on the middleware called "FlowDesigner" to share intermediate audio data, which provides real-time processing. HARK's performance in recognition of noisy/simultaneous speech is shown by using three humanoid robots, Honda ASIMO, SIG2 and Robovie with different microphone layouts.
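The role of a missing feature mask in recognition can be illustrated with a minimal sketch. It is not HARK's implementation: it assumes hard (binary) masks obtained by thresholding a reliability map, and it uses a single diagonal-covariance Gaussian as a stand-in for one acoustic-model state. The function names, the threshold value, and the toy numbers are all hypothetical.

```python
import math

def hard_mask(reliability, threshold=0.5):
    """Binarize a time-frequency reliability map (assumed threshold).

    1.0 marks a feature as reliable, 0.0 marks it as missing.
    """
    return [[1.0 if r >= threshold else 0.0 for r in frame]
            for frame in reliability]

def masked_log_likelihood(features, mask, mean, var):
    """Diagonal-Gaussian log-likelihood with mask-weighted dimensions.

    Each feature dimension contributes its per-dimension log-density
    multiplied by its mask value, so missing features (mask 0.0) are
    left out of the score instead of penalizing the hypothesis.
    """
    total = 0.0
    for frame, frame_mask in zip(features, mask):
        for x, w, mu, v in zip(frame, frame_mask, mean, var):
            log_density = -0.5 * (math.log(2 * math.pi * v) + (x - mu) ** 2 / v)
            total += w * log_density
    return total

# Toy example: one frame, two feature dimensions. The second dimension
# is assumed to be badly distorted by separation and gets low reliability.
features = [[0.0, 10.0]]
reliability = [[0.9, 0.1]]
mask = hard_mask(reliability)
score = masked_log_likelihood(features, mask, mean=[0.0, 0.0], var=[1.0, 1.0])
```

With the mask applied, the distorted second dimension no longer drives the score down; only the reliable first dimension is evaluated. Soft masks with values between 0 and 1 fit the same weighting without code changes.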
