{"title":"基于cnn的昆曲唱腔发音特征识别","authors":"Yizhi Wu, Meng-Fu Huang","doi":"10.1109/CISCE58541.2023.10142856","DOIUrl":null,"url":null,"abstract":"In order to achieve ‘Accurate pronunciation and proper melody’ in Kunqu-singing, the automatic evaluation of pronunciation based on ASR (Automatic Speech Recognition) can be greatly beneficial for learners. In modern phonetics, the articulatory feature describes the articulatory movement during speech production, which directly impact the quality and characteristics of pronunciation. To provide effective feedback for mispronunciation detection in Kunqu-singing, we propose a CNN-based articulatory feature recognition model. To tackle the issue of limited training corpus, we incorporate transfer learning into the model training by utilizing both Jingju and Kunqu corpus. The experimental results from our self-built Kunqu corpus show that the incorporation of transfer learning led to a 6% improvement in the recognition rate of articulatory feature, and the average recognition rate of various articulatory features reached 83.7%, which is 24.4% better than phoneme recognition.","PeriodicalId":145263,"journal":{"name":"2023 5th International Conference on Communications, Information System and Computer Engineering (CISCE)","volume":"102 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CNN-based Articulatory Feature Recognition for Kunqu-Singing Pronunciation Evaluation\",\"authors\":\"Yizhi Wu, Meng-Fu Huang\",\"doi\":\"10.1109/CISCE58541.2023.10142856\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to achieve ‘Accurate pronunciation and proper melody’ in Kunqu-singing, the automatic evaluation of pronunciation based on ASR (Automatic Speech Recognition) can be greatly beneficial for learners. In modern phonetics, the articulatory feature describes the articulatory movement during speech production, which directly impact the quality and characteristics of pronunciation. To provide effective feedback for mispronunciation detection in Kunqu-singing, we propose a CNN-based articulatory feature recognition model. To tackle the issue of limited training corpus, we incorporate transfer learning into the model training by utilizing both Jingju and Kunqu corpus. 
The experimental results from our self-built Kunqu corpus show that the incorporation of transfer learning led to a 6% improvement in the recognition rate of articulatory feature, and the average recognition rate of various articulatory features reached 83.7%, which is 24.4% better than phoneme recognition.\",\"PeriodicalId\":145263,\"journal\":{\"name\":\"2023 5th International Conference on Communications, Information System and Computer Engineering (CISCE)\",\"volume\":\"102 \",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 5th International Conference on Communications, Information System and Computer Engineering (CISCE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CISCE58541.2023.10142856\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 5th International Conference on Communications, Information System and Computer Engineering (CISCE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CISCE58541.2023.10142856","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
CNN-based Articulatory Feature Recognition for Kunqu-Singing Pronunciation Evaluation
To help learners achieve 'accurate pronunciation and proper melody' in Kunqu-singing, automatic pronunciation evaluation based on ASR (Automatic Speech Recognition) can be of great benefit. In modern phonetics, articulatory features describe the articulatory movements during speech production, which directly impact the quality and characteristics of pronunciation. To provide effective feedback for mispronunciation detection in Kunqu-singing, we propose a CNN-based articulatory feature recognition model. To tackle the issue of a limited training corpus, we incorporate transfer learning into model training by utilizing both Jingju and Kunqu corpora. Experimental results on our self-built Kunqu corpus show that incorporating transfer learning improved the articulatory feature recognition rate by 6%, and the average recognition rate across the various articulatory features reached 83.7%, which is 24.4% higher than phoneme recognition.
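The abstract does not specify the network architecture or the exact transfer-learning procedure. The following is a minimal sketch of how a CNN articulatory-feature classifier with Jingju-to-Kunqu transfer might be set up; the mel-spectrogram input, layer sizes, freeze-and-fine-tune strategy, and names such as ArticulatoryCNN and fine_tune_on_kunqu are all illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch (assumptions throughout): a small CNN that maps a
# mel-spectrogram patch to articulatory-feature classes, pretrained on a
# Jingju corpus and then fine-tuned on the smaller Kunqu corpus by freezing
# the convolutional layers and updating only the classifier head.
import torch
import torch.nn as nn


class ArticulatoryCNN(nn.Module):
    """CNN classifier over mel-spectrogram patches (architecture assumed)."""

    def __init__(self, n_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_mels, time_frames)
        return self.classifier(self.features(x))


def fine_tune_on_kunqu(model: ArticulatoryCNN, kunqu_loader, epochs: int = 5):
    """Assumed transfer-learning step: reuse Jingju-pretrained conv layers,
    train only the classifier head on the limited Kunqu data."""
    for p in model.features.parameters():
        p.requires_grad = False
    optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for spectrograms, labels in kunqu_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(spectrograms), labels)
            loss.backward()
            optimizer.step()
    return model
```

Freezing the feature extractor is one common way to exploit a larger related corpus (here, Jingju) when the target corpus (Kunqu) is small; whether the authors freeze layers, fine-tune end to end, or use another transfer scheme is not stated in the abstract.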