Real-time audio-visual localization of user using microphone array and vision camera
Changkyu Choi, Donggeon Kong, Sujin Lee, Kiyoung Park, Sun-Gi Hong, Hyoung-Ki Lee, S. Bang, Yongbeom Lee, Sang Ryong Kim
2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, December 5, 2005
DOI: 10.1109/IROS.2005.1545030
Citations: 9
Abstract
In home environments, demand is increasing for robots that serve the user, for example by cleaning rooms or fetching objects. To accomplish such tasks, it is essential to develop a natural form of human-robot interaction (HRI). One of the most natural interactions is for the robot to approach the user and perform a task after recognizing the user's call and localizing its source. User localization therefore becomes a key technology. In this paper, we propose a novel audio-visual user localization system consisting of a microphone array with eight sensors and a video camera. The calling direction is estimated by spectral subtraction of spatial spectra. In particular, a novel beamforming method is proposed to suppress the nonstationary audio noise that is always present in real environments. Furthermore, a robust face-detection method based on an AdaBoost classifier is proposed to verify the user; a new post-processing step on face candidates reduces false alarms remarkably. Successful results in a real home environment demonstrate the system's efficacy and feasibility. Implementation issues, limitations, and possible solutions are also discussed.
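The abstract does not spell out the exact spectrum estimator, so the following is only a minimal sketch of "spectral subtraction of spatial spectra", assuming a linear eight-microphone array and delay-and-sum spatial spectra; the function names (`spatial_spectrum`, `estimate_call_direction`), array geometry, and parameters are illustrative, not the authors' implementation.

```python
import numpy as np

def spatial_spectrum(frames, mic_x, angles_deg, fs, c=343.0):
    """Delay-and-sum output power over candidate directions for a linear array.

    frames: (num_mics, num_samples) time-domain snapshot.
    mic_x:  (num_mics,) sensor positions along the array axis in metres.
    """
    n = frames.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)                    # (F,)
    X = np.fft.rfft(frames, axis=1)                           # (M, F) per-channel spectra
    power = np.zeros(len(angles_deg))
    for i, theta in enumerate(np.deg2rad(angles_deg)):
        delays = mic_x * np.cos(theta) / c                    # per-mic delay for this look direction
        steer = np.exp(2j * np.pi * np.outer(delays, freqs))  # (M, F) steering phases
        beam = np.sum(X * steer, axis=0)                      # align channels and sum
        power[i] = np.sum(np.abs(beam) ** 2)                  # broadband beam power
    return power

def estimate_call_direction(call_frames, noise_frames, mic_x, fs,
                            angles_deg=np.arange(0, 181, 5)):
    """Subtract the noise-only spatial spectrum from the call-segment spatial
    spectrum and return the direction (degrees) of the residual peak."""
    p_call = spatial_spectrum(call_frames, mic_x, angles_deg, fs)
    p_noise = spatial_spectrum(noise_frames, mic_x, angles_deg, fs)
    residual = np.clip(p_call - p_noise, 0.0, None)           # keep only excess energy
    return angles_deg[int(np.argmax(residual))]
```

The subtraction of a noise-only spatial spectrum is what lets the peak survive directional, nonstationary interferers (e.g. a TV), which is the role the paper assigns to its proposed beamforming stage.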