Authors: François Grondin, F. Michaud
Published in: 2012 IEEE International Conference on Robotics and Automation
Publication date: 2012-05-14
DOI: 10.1109/ICRA.2012.6224729 (https://doi.org/10.1109/ICRA.2012.6224729)
WISS, a speaker identification system for mobile robots
This paper presents WISS, a speaker identification system for mobile robots that is integrated with ManyEars, a sound source localization, tracking and separation system. Speaker identification consists of recognizing an individual among a group of known speakers. For mobile robots, an important challenge is performing speaker identification in the presence of noise that changes over time. To deal with this issue, WISS uses Parallel Model Combination (PMC) and masks to adapt the speaker models (obtained in clean conditions) in real time to both additive and convolutive noise. The results show that the weighted rate of correct speaker identifications is 96% on average at a Signal-to-Noise Ratio (SNR) of 16 dB, and that it decreases only to 84% when the SNR drops to 2 dB.
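
To give a rough sense of what adapting a clean model to additive and convolutive noise can look like, the following is a minimal sketch of the textbook log-add approximation often associated with PMC. It is not taken from the paper: the function name `pmc_log_add`, the use of natural-log band energies, and the per-band channel term are all assumptions made for illustration, and the abstract does not describe how WISS combines this with masks.

```python
import math


def pmc_log_add(clean_log_means, noise_log_means, channel_log_gains=None):
    """Adapt clean-speech model means to noise with the log-add approximation.

    All inputs are per-band natural-log spectral energies (an assumption).
    For each band: y = log(exp(x + h) + exp(n)), where
      x = clean-speech mean, h = convolutive (channel) term, n = additive noise.
    """
    if channel_log_gains is None:
        # No channel effect: a gain of 1 in the linear domain is 0 in the log domain.
        channel_log_gains = [0.0] * len(clean_log_means)
    if not (len(clean_log_means) == len(noise_log_means) == len(channel_log_gains)):
        raise ValueError("all inputs must have the same number of bands")

    adapted = []
    for x, n, h in zip(clean_log_means, noise_log_means, channel_log_gains):
        a, b = x + h, n
        m = max(a, b)  # factor out the larger term to avoid overflow in exp()
        adapted.append(m + math.log(math.exp(a - m) + math.exp(b - m)))
    return adapted


# Hypothetical two-band example: noise matters most where it rivals the speech energy.
noisy_means = pmc_log_add([2.0, 1.0], [0.0, 1.0])
```

In this approximation only the means are adapted and the variances of the clean model are left unchanged, which keeps the update cheap enough to repeat whenever the noise estimate changes.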