{"title":"视听整合学习","authors":"Rujiao Yan, Tobias Rodemann, B. Wrede","doi":"10.1109/DEVLRN.2011.6037323","DOIUrl":null,"url":null,"abstract":"We present a system for learning audiovisual integration based on temporal and spatial coincidence. The current sound is sometimes related to a visual signal that has not yet been seen, we consider this situation as well. Our learning algorithm is tested in online adaptation of audio-motor maps. Since audio-motor maps are not reliable at the beginning of the experiment, learning is bootstrapped using temporal coincidence when there is only one auditory and one visual stimulus. In the course of time, the system can automatically decide to use both spatial and temporal coincidence depending on the quality of maps and the number of visual sources. We can show that this audio-visual integration can work when more than one visual source appears. The integration performance does not decrease when the related visual source has not yet been spotted. The experiment is executed on a humanoid robot head.","PeriodicalId":256921,"journal":{"name":"2011 IEEE International Conference on Development and Learning (ICDL)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Learning of audiovisual integration\",\"authors\":\"Rujiao Yan, Tobias Rodemann, B. Wrede\",\"doi\":\"10.1109/DEVLRN.2011.6037323\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a system for learning audiovisual integration based on temporal and spatial coincidence. The current sound is sometimes related to a visual signal that has not yet been seen, we consider this situation as well. Our learning algorithm is tested in online adaptation of audio-motor maps. Since audio-motor maps are not reliable at the beginning of the experiment, learning is bootstrapped using temporal coincidence when there is only one auditory and one visual stimulus. In the course of time, the system can automatically decide to use both spatial and temporal coincidence depending on the quality of maps and the number of visual sources. We can show that this audio-visual integration can work when more than one visual source appears. The integration performance does not decrease when the related visual source has not yet been spotted. 
The experiment is executed on a humanoid robot head.\",\"PeriodicalId\":256921,\"journal\":{\"name\":\"2011 IEEE International Conference on Development and Learning (ICDL)\",\"volume\":\"52 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE International Conference on Development and Learning (ICDL)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DEVLRN.2011.6037323\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE International Conference on Development and Learning (ICDL)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DEVLRN.2011.6037323","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
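The abstract describes a switching behavior: rely on temporal coincidence alone while the audio-motor map is still unreliable and the pairing is unambiguous, and combine temporal and spatial coincidence once the map quality and the number of visual sources allow it. The paper does not give its formulas here, so the sketch below is only an illustrative reading of that behavior; the quality measure, threshold, time window, and data structures are hypothetical, not the authors' actual method.

```python
# Illustrative sketch only: map_quality, quality_threshold, time_window, and
# VisualSource are hypothetical stand-ins for quantities the abstract mentions
# but does not define.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VisualSource:
    position: float    # estimated azimuth of the visual stimulus (degrees)
    onset_time: float  # stimulus onset time (seconds)


def select_visual_match(sound_position_estimate: float,
                        sound_onset: float,
                        visual_sources: List[VisualSource],
                        map_quality: float,
                        quality_threshold: float = 0.5,
                        time_window: float = 0.5) -> Optional[VisualSource]:
    """Choose which visual source to pair with the current sound."""
    # Temporal coincidence: keep sources whose onset falls inside the window.
    candidates = [v for v in visual_sources
                  if abs(v.onset_time - sound_onset) < time_window]
    if not candidates:
        return None

    if map_quality < quality_threshold and len(candidates) == 1:
        # Bootstrapping phase: the audio-motor map is not yet trusted and the
        # pairing is unambiguous, so temporal coincidence alone is enough.
        return candidates[0]

    # Otherwise combine spatial and temporal coincidence: among temporally
    # coincident sources, pick the one closest to the sound's estimated position.
    return min(candidates,
               key=lambda v: abs(v.position - sound_position_estimate))
```

In this reading, the paired sound/visual events would then drive the online update of the audio-motor map, and a rising map quality gradually shifts the system from the bootstrapping branch to the combined spatiotemporal branch.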