Closed-loop sound source localization in neuromorphic systems

Neuromorphic Computing and Engineering. Pub Date: 2023-06-01. DOI: 10.1088/2634-4386/acdaba

Thorben Schoepe, Daniel Gutierrez-Galan, J. P. Dominguez-Morales, Hugh Greatorex, Angel Francisco Jiménez Fernández, A. Linares-Barranco, E. Chicca


Citations: 1

Abstract

Sound source localization (SSL) is used in applications such as industrial noise control, speech detection in mobile phones, and speech enhancement in hearing aids. The newest video-conferencing setups also use SSL: the position of a speaker is detected from differences in the audio waves received by a microphone array, and the camera then focuses on the speaker's location. The human brain can also localize a speaker from auditory signals. Among other cues, it uses the differences in amplitude and arrival time of the sound wave at the two ears, called the interaural level difference and the interaural time difference. However, the substrate and computational primitives of the brain differ from those of classical digital computing. Because of its low power consumption of around 20 W and its real-time performance, the human brain has become a major source of inspiration for emerging technologies. One of these technologies is neuromorphic hardware, which implements the fundamental principles of brain computation identified to date using complementary metal-oxide-semiconductor (CMOS) technologies and new devices. In this work we propose the first neuromorphic closed-loop robotic system that uses the interaural time difference for SSL in real time. Our system can successfully locate sound sources such as human speech. In a closed-loop experiment, the robotic platform turned immediately toward the sound source, with a turning velocity linearly proportional to the angular difference between the sound source and the binaural microphones. After this initial turn, the robotic platform remained oriented toward the sound source. Even though the system uses very few of the available hardware resources, consumes around 1 W, and was tuned only by hand, meaning that it involves no learning at all, it already reaches performance comparable to that of other neuromorphic approaches. The SSL system presented in this article brings us one step closer to neuromorphic event-based systems for robotics and embodied computing.
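The two quantitative ideas in the abstract, localization from the interaural time difference and a turning velocity proportional to the angular error, can be illustrated with a short numerical sketch. This is not the authors' spiking neuromorphic implementation, which the abstract does not detail. It is a conventional simulation under loudly stated assumptions: a far-field model in which the time difference equals `d * sin(angle) / c`, a hypothetical microphone spacing of 0.15 m, a speed of sound of 343 m/s, and an arbitrary controller gain. All function and constant names are invented for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
MIC_DISTANCE = 0.15     # m, hypothetical spacing of the two microphones


def itd_from_angle(angle_rad):
    """Far-field interaural time difference (s) for a source at angle_rad."""
    return MIC_DISTANCE * math.sin(angle_rad) / SPEED_OF_SOUND


def angle_from_itd(itd_s):
    """Invert the far-field model; the ratio is clamped so asin stays defined."""
    ratio = itd_s * SPEED_OF_SOUND / MIC_DISTANCE
    ratio = max(-1.0, min(1.0, ratio))
    return math.asin(ratio)


def simulate_turn(source_angle_rad, gain=2.0, dt=0.01, steps=500):
    """Rotate a platform with angular velocity proportional to the estimated error.

    gain, dt and steps are illustrative values, not taken from the paper.
    Returns the heading (rad) after every time step.
    """
    heading = 0.0
    history = []
    for _ in range(steps):
        # Angle of the source relative to the current microphone orientation.
        error = source_angle_rad - heading
        # The platform only "hears" a time difference, so estimate the error from it.
        estimated_error = angle_from_itd(itd_from_angle(error))
        # Proportional control: turning velocity = gain * estimated angular error.
        heading += gain * estimated_error * dt
        history.append(heading)
    return history


if __name__ == "__main__":
    target = math.radians(60.0)
    headings = simulate_turn(target)
    print(f"final heading: {math.degrees(headings[-1]):.2f} degrees")
```

In this sketch the heading error shrinks geometrically, so the platform turns quickly at first and then settles facing the source, which mirrors the qualitative behavior the abstract reports. The model is only valid for sources within 90 degrees of the frontal direction, because a time difference alone cannot distinguish front from back.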




Source journal

Neuromorphic Computing and Engineering

CiteScore

5.90

Self-citation rate

0.00%

Number of publications