Evaluating vowel pronunciation quality: Formant space matching versus ASR confidence scoring
Ashish Patil, Chitralekha Gupta, P. Rao
2010 National Conference On Communications (NCC), published 2010-03-15
DOI: 10.1109/NCC.2010.5430187 (https://doi.org/10.1109/NCC.2010.5430187)
Citations: 9
Abstract
Quantitative evaluation of how well a speaker pronounces the vowels of a language can contribute to the important task of speaker accent detection. Our aim is to distinguish, both qualitatively and quantitatively, between native and non-native speakers of a language through a comparative study of two analysis methods. The first examines the relative positions of a speaker's vowels in formant (F1-F2) space, which conveys important articulatory information. The second exploits the sensitivity of trained phone models to accent variations, as captured by log-likelihood scores, to distinguish between native and non-native speakers.
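The abstract does not give the exact scoring procedures, so the following is only a minimal sketch of how the two kinds of measurement could be computed, not the authors' method. It assumes that formant measurements are already available as (F1, F2) pairs in Hz per vowel, and that a recognizer has already produced per-frame log-likelihoods. The function names `relative_positions`, `formant_space_distance`, and `mean_frame_log_likelihood` are hypothetical, as is the choice of centring each speaker's vowel space and using Euclidean distance.

```python
import math


def centroid(points):
    """Mean (F1, F2) of a list of formant measurements in Hz."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def relative_positions(vowel_formants):
    """Vowel centroids expressed relative to the centre of the speaker's vowel space.

    Assumption: subtracting the mean of the vowel centroids is used here as a
    simple way to compare relative, rather than absolute, vowel positions.
    """
    cents = {v: centroid(pts) for v, pts in vowel_formants.items()}
    c1 = sum(c[0] for c in cents.values()) / len(cents)
    c2 = sum(c[1] for c in cents.values()) / len(cents)
    return {v: (c[0] - c1, c[1] - c2) for v, c in cents.items()}


def formant_space_distance(speaker, reference):
    """Mean Euclidean distance between the relative positions of shared vowels."""
    a = relative_positions(speaker)
    b = relative_positions(reference)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("speaker and reference share no vowels")
    return sum(math.dist(a[v], b[v]) for v in shared) / len(shared)


def mean_frame_log_likelihood(frame_log_likelihoods):
    """Duration-normalised acoustic score: average log-likelihood per frame.

    Assumption: the per-frame scores come from phone models trained on
    native speech; lower averages would suggest a poorer match.
    """
    return sum(frame_log_likelihoods) / len(frame_log_likelihoods)
```

With these definitions, a speaker whose formants are all shifted by a constant offset from the reference would still score a distance of zero, because only relative vowel positions are compared. Whether that invariance is desirable depends on how speaker normalization is handled, which the abstract does not specify.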