A Free-Source Method (FrSM) for Calibrating a Large-Aperture Microphone Array
Authors: Sarthak Khanal, H. Silverman, Rahul R. Shakya
Journal: IEEE Transactions on Audio Speech and Language Processing
Publication date: 2013-08-01
Publication type: Journal Article
DOI: 10.1109/TASL.2013.2256896 (https://doi.org/10.1109/TASL.2013.2256896)
Citations: 15
Abstract
Large-aperture microphone arrays can be used to capture and enhance speech from individual talkers in noisy, multi-talker, and reverberant environments. However, they must be calibrated, often more than once, to obtain accurate three-dimensional coordinates for all microphones. Direct-measurement techniques, such as using a measuring tape or a laser-based tool, are cumbersome and time-consuming. Some previous methods that used acoustic signals for array calibration required bulky hardware and/or fixed, known source locations. Others, which allowed more flexible source placement, often have issues with real data, have reported results for 2D only, or work only for small arrays. This paper describes a complete and robust method for automatic calibration using acoustic signals that is simple, repeatable, and accurate, and that has been shown to work for a real system. The method requires only a single transducer (speaker) with a microphone attached above its center. The unit is moved freely around the focal volume of the microphone array, generating a single long recording from all the microphones. After that, the system is completely automatic. We describe the free-source method (FrSM), validate its effectiveness, and present accuracy results against measured ground truth. The performance of FrSM is compared with that of several other methods for a real 128-microphone array.
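The geometry behind acoustic calibration can be illustrated with a minimal sketch. It is not the FrSM algorithm, which the abstract does not detail and which does not require known source locations. The sketch instead assumes the simpler fixed-source setting mentioned above: source positions are known, and the distance from each source position to one microphone is available (for example, time of flight multiplied by the speed of sound). The function names `locate_microphone` and `solve_linear`, and all coordinates, are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate source geometry (e.g. coplanar sources)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def locate_microphone(sources, distances):
    """Least-squares 3D microphone position from distances to KNOWN source points.

    Each measurement gives |m - s_i|^2 = d_i^2. Subtracting the equation for
    source 0 removes the |m|^2 term and leaves a linear system in m:
        2 (s_i - s_0) . m = |s_i|^2 - |s_0|^2 - d_i^2 + d_0^2
    """
    s0, d0 = sources[0], distances[0]
    n0 = sum(c * c for c in s0)
    rows, rhs = [], []
    for s, d in zip(sources[1:], distances[1:]):
        rows.append([2.0 * (s[k] - s0[k]) for k in range(3)])
        rhs.append(sum(c * c for c in s) - n0 - d * d + d0 * d0)
    # Normal equations: (A^T A) m = A^T b
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(3)]
    return solve_linear(ata, atb)


# Illustrative data only: five non-coplanar source positions (metres)
# and noise-free distances to a made-up microphone position.
sources = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.5), (0.0, 3.0, 1.0),
           (4.0, 3.0, 2.0), (2.0, 1.5, 2.5)]
true_mic = (1.0, 2.0, 1.5)
distances = [math.dist(true_mic, s) for s in sources]
estimate = locate_microphone(sources, distances)
```

With noise-free distances the estimate matches the assumed microphone position to within floating-point error. With real recordings the distances would be noisy, and using more source positions would help average out the error.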
About the journal:
The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.