Learning an intelligibility map of individual utterances
Michael I. Mandel
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013-10-01
DOI: 10.1109/WASPAA.2013.6701835 (https://doi.org/10.1109/WASPAA.2013.6701835)
Citations: 6
Abstract
Predicting the intelligibility of noisy recordings is difficult, and most current algorithms aim only to be correct on average across many recordings. This paper describes a listening test paradigm and an associated analysis technique that can predict the intelligibility of a specific recording of a word in the presence of a specific noise instance. The analysis learns a map of the importance of each point in the recording's spectrogram to the overall intelligibility of the word when it is glimpsed through “bubbles” in many noise instances. Because this is treated as a classification problem, a linear classifier can be used to predict intelligibility and can be examined to determine the importance of spectral regions. This approach was tested on recordings of vowels and consonants. The important regions identified by the model in these tests agreed with those identified by a standard, non-predictive statistical test of independence and with the acoustic phonetics literature.
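The classification idea in the abstract can be illustrated with a minimal sketch on synthetic data. This is not the paper's implementation: the grid size, the Gaussian bubble shape, the hypothetical `important` region, the simulated listener responses, and the plain gradient-descent logistic regression are all assumptions chosen for illustration. Each trial reveals the spectrogram through a random bubble mask, the simulated listener is "correct" when enough of the important region is revealed, and the weights of a linear classifier are then reshaped into a time-frequency importance map.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FREQ, N_TIME, N_TRIALS = 16, 20, 400  # assumed toy spectrogram size and trial count


def make_bubble_mask(n_bubbles=6, width=2.0):
    """Return a mask in [0, 1]; values near 1 mark points glimpsed through a bubble."""
    f = np.arange(N_FREQ)[:, None]
    t = np.arange(N_TIME)[None, :]
    mask = np.zeros((N_FREQ, N_TIME))
    centers_f = rng.integers(0, N_FREQ, n_bubbles)
    centers_t = rng.integers(0, N_TIME, n_bubbles)
    for cf, ct in zip(centers_f, centers_t):
        bump = np.exp(-((f - cf) ** 2 + (t - ct) ** 2) / (2 * width ** 2))
        mask = np.maximum(mask, bump)
    return mask


# One random bubble mask per simulated listening trial.
masks = np.stack([make_bubble_mask() for _ in range(N_TRIALS)])

# Hypothetical ground truth: one time-frequency region carries the cue.
important = np.zeros((N_FREQ, N_TIME), dtype=bool)
important[4:9, 6:11] = True

# Simulated response: correct when the region is revealed more than usual.
exposure = masks[:, important].mean(axis=1)
exposure = exposure + 0.05 * rng.standard_normal(N_TRIALS)
y = (exposure > np.median(exposure)).astype(float)

# Linear classifier: L2-regularised logistic regression by gradient descent.
X = masks.reshape(N_TRIALS, -1)
X = X - X.mean(axis=0)
w = np.zeros(X.shape[1])
b = 0.0
lr, l2 = 0.2, 1e-3
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y) / N_TRIALS + l2 * w)
    b -= lr * np.mean(p - y)

# Reshape the weights into an importance map over the spectrogram.
weight_map = w.reshape(N_FREQ, N_TIME)
inside = weight_map[important].mean()
outside = weight_map[~important].mean()
accuracy = np.mean(((X @ w + b) > 0) == (y > 0.5))

print(f"training accuracy: {accuracy:.2f}")
print(f"mean weight inside region: {inside:.3f}, outside: {outside:.3f}")
```

In this toy setting the mean weight inside the assumed region should exceed the mean weight outside it, which is the sense in which a linear classifier "can be examined" to find important spectral regions. Real data would need held-out evaluation rather than training accuracy.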