Optimization and Evaluation of Spoken English CAF Based on Artificial Intelligence and Corpus
Authors: Wenfang Zhang, Xiaodong Wang
Journal: Journal of Artificial Intelligence Practice
DOI: 10.23977/jaip.2023.060506 (https://doi.org/10.23977/jaip.2023.060506)
English is the most widely used language in the world, and the pronunciation of spoken English is correspondingly important. Traditional methods for spoken English recognition score poorly on complexity, accuracy, and fluency (CAF), so this paper studies how AI and a corpus can be used to optimize and evaluate spoken English CAF. It proposes using two AI models for this purpose: the hidden Markov model (HMM) and the convolutional neural network (CNN). A variety of English speech samples from the BNC corpus are selected for model training and testing. The complexity (C), accuracy (A), and fluency (F) of the CNN model's recognition, together with their harmonic mean, serve as evaluation indicators, and the recognition spectrograms of the HMM model are aggregated and analyzed. In the experiments, the CNN model's indicators improved markedly when the number of frames was 210, so 210 frames are used for testing in this paper. The results show an A value of about 85% for the HMM model, 67% for the CNN model, and only 35% for the traditional SVM model. The C value is about 60% for the HMM model, 65% for the CNN model, and only 45% for the traditional model. The F value is about 83% for the HMM model, 67% for the CNN model, and 46% for the traditional model. By comparison, the HMM model recognizes spoken English more accurately and produces more fluent recognition results, whereas the CNN model can recognize spoken English of higher complexity. Both the CNN model and the HMM model improve the CAF optimization of spoken English.
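The abstract names a harmonic mean of the CAF indicators but does not give its formula or its values. As a minimal sketch, assuming the standard unweighted harmonic mean of the three percentages, the reported figures can be combined as follows. The function name `caf_harmonic_mean` and the `reported` table are illustrative and do not come from the paper, and the resulting numbers are derived here, not results reported by the authors.

```python
from statistics import harmonic_mean

# Approximate C, A, F percentages as quoted in the abstract.
# The dictionary layout is an assumption made for this example.
reported = {
    "HMM": {"C": 60.0, "A": 85.0, "F": 83.0},
    "CNN": {"C": 65.0, "A": 67.0, "F": 67.0},
    "SVM (traditional)": {"C": 45.0, "A": 35.0, "F": 46.0},
}


def caf_harmonic_mean(c: float, a: float, f: float) -> float:
    """Unweighted harmonic mean of complexity, accuracy and fluency.

    Assumes the plain definition 3 / (1/c + 1/a + 1/f); the paper may
    use a different weighting.
    """
    return harmonic_mean([c, a, f])


if __name__ == "__main__":
    for model, s in reported.items():
        h = caf_harmonic_mean(s["C"], s["A"], s["F"])
        print(f"{model}: harmonic mean of C, A, F = {h:.1f}%")
```

Because the harmonic mean is pulled toward the lowest of its inputs, a model has to do reasonably well on all three indicators to score highly, which is presumably why it suits a combined CAF measure.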