Phoneme segmentation using deep learning for speech synthesis

Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems. Pub Date: 2018-10-09. DOI: 10.1145/3264746.3264801

Young Han Lee, Jong-Yeol Yang, C. Cho, Hyedong Jung

Citations: 3

Abstract

In this paper, we propose a phoneme segmentation method based on a deep learning algorithm. Phoneme segmentation is one of the basic modules of unit-selection-based speech synthesis. To improve it, we add a cross-entropy loss to a Deep Speech-based speech recognition architecture. This approach yields more accurate phoneme boundaries. In our experiments, the proposed method achieves a boundary accuracy of 20.91 %, which is higher than that of conventional phoneme segmentation.
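The abstract does not say how the extra cross-entropy term is combined with the recognition loss, nor how boundary accuracy is scored. The following is only a minimal sketch under assumptions that are not taken from the paper: a hypothetical weight on the frame-level cross-entropy term, a 10 ms frame shift, and a tolerance window (20 ms here) for counting a predicted boundary as correct. All function names and values are illustrative.

```python
import math

def frame_cross_entropy(posteriors, labels):
    """Mean negative log-likelihood of the reference label at each frame."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, label in zip(posteriors, labels):
        total -= math.log(probs[label] + eps)
    return total / len(labels)

def combined_loss(recognition_loss, posteriors, labels, weight=0.5):
    """Recognition loss plus a weighted frame-level cross-entropy term.

    `weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    return recognition_loss + weight * frame_cross_entropy(posteriors, labels)

def boundaries_from_frames(frame_labels, frame_shift_ms=10.0):
    """Times (ms) at which the frame-level label changes (assumed frame shift)."""
    return [i * frame_shift_ms
            for i in range(1, len(frame_labels))
            if frame_labels[i] != frame_labels[i - 1]]

def boundary_accuracy(predicted, reference, tolerance_ms=20.0):
    """Fraction of reference boundaries matched by a prediction within tolerance."""
    if not reference:
        return 0.0
    hits = sum(1 for r in reference
               if any(abs(p - r) <= tolerance_ms for p in predicted))
    return hits / len(reference)

# Toy data: 6 frames, 3 phoneme classes (made-up numbers for illustration).
posteriors = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
]
reference_labels = [0, 0, 0, 1, 1, 2]

# Frame-wise decision: pick the most probable class at each frame.
predicted_labels = [max(range(len(p)), key=p.__getitem__) for p in posteriors]

predicted_bounds = boundaries_from_frames(predicted_labels)
reference_bounds = boundaries_from_frames(reference_labels)
accuracy = boundary_accuracy(predicted_bounds, reference_bounds)
```

In this toy setup, the frame-level term penalizes frames whose posterior disagrees with the aligned reference label, which is one plausible way such a loss could sharpen boundary placement. The actual loss formulation and evaluation protocol would need to be taken from the full paper.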


