Integrating prior knowledge and data-driven approaches for improving grapheme-to-phoneme conversion in Korean language

IF 3.1; Zone 3 (Computer Science); JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Soft Computing Pub Date : 2024-08-19 DOI:10.1007/s00500-024-09934-2

Dezhi Cao, Yue Zhao, Licheng Wu

Citations: 0

Abstract

Grapheme-to-phoneme (G2P) conversion is currently dominated by two methodologies: knowledge-based and data-driven approaches. Knowledge-based methods struggle to adapt to large datasets, while data-driven methods rely heavily on high-quality data and require careful feature selection for model construction. To address these challenges, this work proposes an integrated approach that combines prior knowledge with data-driven techniques for automatic G2P conversion in Korean. We extract attributes based on pronunciation rules and on the phonetic transformations that occur between Korean words, and use them to construct a decision tree. The model is then trained in a data-driven manner to perform automatic phonetic transcription. The integrated model aligns input and output variables more accurately, captures phonological variation in continuous Korean speech, and determines the phoneme corresponding to each grapheme. Cross-validation yields an average G2P accuracy of 94.63%, outperforming existing methods. These results demonstrate the effectiveness of combining prior knowledge with data-driven techniques for Korean G2P conversion. The approach can also be applied to low-resource or endangered languages for which some linguistic research already exists, to improve the accuracy of their pronunciation lexicons.
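The abstract does not list the attributes fed to the decision tree, so the following is only a minimal sketch of what rule-motivated attributes could look like, not the authors' feature set. It relies on one fact from the Unicode standard (precomposed Hangul syllables decompose arithmetically into initial, medial, and final jamo indices) and on the assumption that a syllable's final consonant together with the next syllable's initial is a useful context attribute, since many Korean pronunciation rules are triggered at that boundary. The function names `decompose` and `context_features` are hypothetical.

```python
# Illustrative sketch only: the attribute set below is an assumption,
# not the feature set used in the paper.

HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable
NUM_INITIALS = 19
NUM_MEDIALS = 21
NUM_FINALS = 28        # 27 final consonants plus index 0 for "no final"


def decompose(syllable):
    """Return (initial, medial, final) jamo indices for one precomposed Hangul syllable."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < NUM_INITIALS * NUM_MEDIALS * NUM_FINALS:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    initial, rest = divmod(offset, NUM_MEDIALS * NUM_FINALS)
    medial, final = divmod(rest, NUM_FINALS)
    return initial, medial, final


def context_features(word):
    """Build one attribute row per syllable: its own jamo indices plus
    the initial of the following syllable (-1 at the end of the word)."""
    jamo = [decompose(ch) for ch in word]
    rows = []
    for i, (initial, medial, final) in enumerate(jamo):
        next_initial = jamo[i + 1][0] if i + 1 < len(jamo) else -1
        rows.append(
            {
                "initial": initial,
                "medial": medial,
                "final": final,
                "next_initial": next_initial,
            }
        )
    return rows


if __name__ == "__main__":
    # In this word the final of the first syllable meets a nasal initial,
    # a boundary where Korean nasalization applies.
    for row in context_features("국물"):
        print(row)
```

Rows of this shape could be passed to any decision-tree learner, with the observed phoneme of each syllable as the label. Integer jamo indices are used here only to keep the sketch dependency-free; a real system would more likely use categorical or rule-derived attributes.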

Source journal

Soft Computing (Engineering & Technology - Computer Science: Interdisciplinary Applications)

CiteScore: 8.10

Self-citation rate: 9.80%

Articles published: 927

Review time: 7.3 months

About the journal: Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.