Automatic Speech Corpus Construction from Broadcasting Speech Databases

2010 International Conference on Computational Intelligence and Security. Pub Date: 2010-12-11. DOI: 10.1109/CIS.2010.145

Wei Zhang, R. Du, Minhui Pang, Qiuhong Wang

Citations: 3

Abstract

Diversified speech synthesis often requires new speech corpora to be built frequently. This paper describes our work on automatically constructing a speech corpus from broadcast speech databases for a trainable Text-To-Speech (TTS) system, and presents a new framework for this task. First, a music detector based on speech/music discrimination selects the clean speech audio from the broadcast audio. Next, an automatic sentence segmentation system generates a sentence database from the clean speech audio. Finally, a text corpus construction method selects the spoken sentences that maximize coverage of the sentence database's diphones. Experiments show that our method can generate a good speech corpus rapidly with minimal manual intervention.
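
The abstract does not say how the coverage-maximizing selection is carried out. One common way to approach this kind of problem is a greedy set-cover heuristic, sketched below purely as an illustration and not as the authors' method. The function names (`diphones`, `select_sentences`), the `budget` parameter, and the toy phone labels are all assumptions introduced for this example; a real system would take phone sequences from a pronunciation lexicon or aligner.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phones: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phone pairs (diphones) in one sentence."""
    return set(zip(phones, phones[1:]))


def select_sentences(sentences: Dict[str, List[str]], budget: int) -> List[str]:
    """Greedy sketch (hypothetical, not the paper's algorithm): pick up to
    `budget` sentence ids, each time taking the sentence that adds the most
    diphones not yet covered. Ties go to the smallest id."""
    units = {sid: diphones(phones) for sid, phones in sentences.items()}
    covered: Set[Diphone] = set()
    chosen: List[str] = []
    while len(chosen) < budget:
        best_id, best_gain = None, 0
        for sid in sorted(units):
            if sid in chosen:
                continue
            gain = len(units[sid] - covered)
            if gain > best_gain:
                best_id, best_gain = sid, gain
        if best_id is None:
            # No remaining sentence adds a new diphone, so stop early.
            break
        chosen.append(best_id)
        covered |= units[best_id]
    return chosen


if __name__ == "__main__":
    # Toy phone sequences; the labels are placeholders, not a real phone set.
    toy = {
        "s1": ["a", "b", "c", "d"],
        "s2": ["a", "b", "c"],
        "s3": ["d", "e", "f"],
        "s4": ["c", "d", "e"],
    }
    print(select_sentences(toy, budget=2))
```

Greedy selection does not guarantee an optimal subset, but it is simple and stops as soon as no remaining sentence contributes a new diphone, which keeps the selected corpus small.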
