A precise evaluation method of prosodic quality of non-native speakers using average voice and prosody substitution
Authors: Hafiyan Prafianto, Takashi Nose, A. Ito
Published in: 2016 International Conference on Audio, Language and Image Processing (ICALIP), 2016-07-01
DOI: 10.1109/ICALIP.2016.7846620 (https://doi.org/10.1109/ICALIP.2016.7846620)
We propose a method to improve the consistency of human evaluation of non-native speakers' utterances, with the capability to evaluate individual prosodic features such as accent and rhythm. In this method, human evaluators assess the accent and the rhythm independently, using an average voice model and prosody substitution. We also investigated the advantages of evaluating those features independently. We found that, when the prosodic features are not evaluated independently, the accent scores are affected by the quality of the rhythm and vice versa. The correlation coefficient between the accent score and the rhythm score of identical utterances was 0.23 using the conventional method and −0.026 using the proposed method. This interdependence in the conventional method also leads to greater disagreement between the scores given by different evaluators. Using the conventional method, 23% of evaluator pairs had an inter-evaluator correlation of the rhythm score above 0.5, whereas using the proposed method, 67% of evaluator pairs did.
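The agreement statistic reported above (the share of evaluator pairs whose rhythm-score correlation exceeds 0.5) can be illustrated with a minimal sketch. It assumes the correlation is the Pearson coefficient, which the abstract does not state explicitly. The names `pearson`, `fraction_of_pairs_above`, and `rhythm_scores`, as well as the scores themselves, are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def fraction_of_pairs_above(scores_by_evaluator, threshold=0.5):
    """Share of evaluator pairs whose score correlation exceeds the threshold."""
    pairs = list(combinations(scores_by_evaluator, 2))
    above = sum(
        1
        for a, b in pairs
        if pearson(scores_by_evaluator[a], scores_by_evaluator[b]) > threshold
    )
    return above / len(pairs)


# Made-up rhythm scores for five utterances from three hypothetical evaluators.
rhythm_scores = {
    "evaluator_a": [1, 2, 3, 4, 5],
    "evaluator_b": [2, 3, 3, 5, 5],
    "evaluator_c": [5, 3, 4, 2, 1],
}

print(fraction_of_pairs_above(rhythm_scores))
```

In this toy data only the pair (evaluator_a, evaluator_b) agrees strongly, so one of the three pairs clears the threshold. The same `pearson` helper could be applied to the accent and rhythm scores of identical utterances to obtain the cross-feature correlation discussed above.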