Hardware and algorithms for ultrasonic depth imaging
Authors: Ivan Dokmanić, I. Tashev
Venue: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6702-6706
Publication date: 2014-05-04
DOI: https://doi.org/10.1109/ICASSP.2014.6854897
Citations: 11
Abstract
Depth imaging is commonly based on light. For example, LIDAR and Kinect use infrared light, while stereo cameras use visible light. These systems require precise calibration and hardware operating at high sampling frequencies, and they dissipate significant power. In this paper, we investigate the potential of ultrasound for image and depth acquisition, with applications to human-computer interaction and skeletal tracking in mind. We use a loudspeaker array and a microphone array to sense the scene. We discuss a technique for offline loudspeaker beamforming (commonly used for microphone beamforming) which enables us to significantly increase the frame rate. Further, we propose a sound-source-localization-based method for computing the depth image, giving a substantial improvement over the naïve time-of-flight approach. We designed inexpensive hardware with eight elements per array to obtain both the depth and the intensity images. Even with this limited number of transducers, we obtain promising experimental results.
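For orientation, the naïve time-of-flight baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `tof_depth`, the 192 kHz sampling rate, the 40 kHz pulse, and the speed of sound of 343 m/s are all hypothetical choices made for the example. The sketch finds the lag at which the received signal best matches the emitted pulse and converts the round-trip delay into a range via d = c·t/2.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def tof_depth(emitted, received, fs, c=SPEED_OF_SOUND):
    """Estimate range (m) from the echo delay found by cross-correlation.

    emitted, received: 1-D arrays sampled at fs (Hz).
    Hypothetical helper for illustration only.
    """
    corr = np.correlate(received, emitted, mode="full")
    # In "full" mode the output covers lags -(len(emitted)-1) .. len(received)-1.
    lags = np.arange(-(len(emitted) - 1), len(received))
    # An echo cannot arrive before the pulse is sent, so keep non-negative lags.
    valid = lags >= 0
    lag = lags[valid][np.argmax(np.abs(corr[valid]))]
    round_trip = lag / fs          # seconds, loudspeaker -> object -> microphone
    return c * round_trip / 2.0    # halve it: the sound travels there and back


# Usage: a synthetic, noise-free echo delayed by 1120 samples.
fs = 192_000                                  # Hz, assumed sampling rate
n = np.arange(96)
pulse = np.hanning(96) * np.sin(2 * np.pi * 40_000 * n / fs)  # 40 kHz burst
received = np.zeros(4096)
received[1120:1120 + len(pulse)] = 0.2 * pulse                # attenuated echo
depth = tof_depth(pulse, received, fs)
print(f"estimated depth: {depth:.4f} m")
```

A single strongest-peak estimate like this yields one range per emitted beam and is sensitive to overlapping echoes, which is the kind of limitation the abstract's sound-source-localization-based method is said to improve on.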