Tone feature extraction through parametric modeling and analysis-by-synthesis-based pattern matching
Authors: Jinfu Ni, H. Kawai
Published in: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), 2003-05-21
DOI: https://doi.org/10.1109/ICASSP.2003.1198719
A functional fundamental frequency (F0) model is applied to extract tone peak and gliding features from Mandarin F0 contours, with the aim of automatic prosodic labeling of a large-scale speech corpus. Modeling the four lexical tones and representing them in parametric form based on the F0 model, we first cluster baseline tone patterns using the LBG (Linde-Buzo-Gray) algorithm, then perform analysis-by-synthesis-based pattern matching to estimate the underlying tone peaks and tone pattern types from the observed F0 contours and from phonetic labels that include lexical tones. Tone gliding features are re-estimated once the tone peaks have been determined. In an open test of 968 utterances from eight native speakers, 94% of the automatically estimated labels were consistent with the manual labels. Experimental results also indicate that the proposed method is applicable to F0 contour smoothing and tone verification.
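The clustering and matching steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the F0 model or its parameters, so each tone pattern is assumed here to be a fixed-length vector of contour samples, and the matching step simply picks the codeword with the smallest squared error against the observed contour. The function names `lbg` and `match_pattern`, the additive splitting offset `eps`, and the toy data are all hypothetical.

```python
import numpy as np

def lbg(vectors, n_codewords, eps=0.01, tol=1e-6, max_iter=100):
    """Train a codebook by LBG splitting (n_codewords is assumed to be a power of two)."""
    data = np.asarray(vectors, dtype=float)
    codebook = data.mean(axis=0, keepdims=True)  # start from the global centroid
    while codebook.shape[0] < n_codewords:
        # Split every codeword into a pair offset by +eps and -eps.
        codebook = np.vstack([codebook + eps, codebook - eps])
        prev = np.inf
        for _ in range(max_iter):
            # Assign each vector to its nearest codeword (squared Euclidean distance).
            dists = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = dists.argmin(axis=1)
            distortion = dists[np.arange(len(data)), nearest].mean()
            # Move each codeword to the centroid of its cell; empty cells stay put.
            for k in range(codebook.shape[0]):
                members = data[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
            # Stop refining once the average distortion no longer improves.
            if prev - distortion <= tol * max(distortion, 1e-12):
                break
            prev = distortion
    return codebook

def match_pattern(observed, codebook):
    """Return (index, squared error) of the codeword closest to the observed contour."""
    errors = ((codebook - np.asarray(observed, dtype=float)) ** 2).sum(axis=1)
    best = int(errors.argmin())
    return best, float(errors[best])

# Toy data (invented for illustration): noisy high-level and low-dipping contours.
rng = np.random.default_rng(0)
high = np.array([5, 5, 5, 5, 5.0]) + rng.normal(0, 0.1, (50, 5))
low = np.array([2, 1, 1, 2, 4.0]) + rng.normal(0, 0.1, (50, 5))
codebook = lbg(np.vstack([high, low]), n_codewords=2)
index, error = match_pattern([5, 5, 5, 5, 5], codebook)
```

In the method described above, the candidates compared against the observation would be contours synthesized from the parametric F0 model. The nearest-codeword search stands in for that step only to show the shape of the comparison.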