K. Songwatana, S. Sriratanapaprat, P. Kultap, K. Sittiprasert, N. Suktangman
2006 1st International Symposium on Wireless Pervasive Computing, 2006-04-10. DOI: 10.1109/ISWPC.2006.1613615 (https://doi.org/10.1109/ISWPC.2006.1613615)
Recognition of 24 Thai Spoken Vowels Using the Coefficients of 3rd Order Polynomial Regression on the Voice Energy and Spectrum of LPC on the Bark Scale
This paper presents a vowel recognition method for spoken Thai. The Thai language has 9 short unmixed vowels (a, i, ω, u, o, e, ε, γ, [unk]); 9 long unmixed vowels (aa, ii, ωω, uu, oo, ee, εε, γγ, [unk][unk]); 3 short mixed vowels (ia, ωa, ua); and 3 long mixed vowels (i:a:, ω:a:, u:a:), for 24 vowels in total. The proposed method uses three-stage decision making: step 1 distinguishes long from short vowels using the coefficients of a third-order polynomial regression of the signal energy as the feature set and 5-NN as the classification method; step 2 classifies each voice segment (frame) into one of the 9 basic vowels using 18 critical-band intensities as the feature set and 9-NN as the classification method; finally, step 3 decides whether each frame contains a mixed or an unmixed vowel via a thresholding method. This approach differs from conventional speech recognition mainly in that a decision is made for each frame, whereas conventional speech recognition chooses the best decision for a sequence of frames forming a word or a sentence. The algorithm is evaluated on 3024 voice samples from male and female subjects, and each step is evaluated in succession.
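The first stage described in the abstract (cubic regression on the energy contour, followed by a 5-NN vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the frame length, the time-axis normalization, or the distance metric, so non-overlapping frames, a time axis scaled to [0, 1], and Euclidean distance are all assumptions here, and the function names `frame_energies`, `cubic_fit`, `solve`, and `knn_predict` are hypothetical.

```python
from collections import Counter
import math


def frame_energies(samples, frame_len):
    """Short-time energy: sum of squares over consecutive,
    non-overlapping frames (framing scheme is an assumption)."""
    return [
        sum(x * x for x in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def cubic_fit(ys):
    """Least-squares fit of c0 + c1*t + c2*t^2 + c3*t^3 to ys,
    with t spread evenly over [0, 1] (assumed normalization).
    Returns [c0, c1, c2, c3], the stage-1 feature vector."""
    n = len(ys)
    if n < 4:
        raise ValueError("need at least 4 points for a cubic fit")
    ts = [i / (n - 1) for i in range(n)]
    # Normal equations (A^T A) c = A^T y, where A[i][j] = ts[i] ** j.
    ata = [[sum(t ** (r + c) for t in ts) for c in range(4)] for r in range(4)]
    aty = [sum((t ** r) * y for t, y in zip(ts, ys)) for r in range(4)]
    return solve(ata, aty)


def knn_predict(train, query, k):
    """Majority vote among the k training items nearest to query.
    train is a list of (feature_vector, label) pairs; Euclidean
    distance is an assumption."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In use, each labelled utterance would be reduced to `cubic_fit(frame_energies(samples, frame_len))`, and a new utterance would be labelled long or short with `knn_predict(train, features, 5)`. The same `knn_predict` with `k=9` could serve stage 2 once an 18-value critical-band feature vector is available for each frame; computing that vector from the LPC spectrum is not shown here.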