Unsupervised discovery of phoneme boundaries in multi-speaker continuous speech

2011 IEEE International Conference on Development and Learning (ICDL). Pub Date: 2011-10-10. DOI: 10.1109/DEVLRN.2011.6037316

T. Armstrong, Stephanie Antetomaso

Citations: 2

Abstract

Children rapidly learn the inventory of phonemes used in their native tongues. Computational approaches to learning phoneme boundaries from speech data do not yet reach human-level performance. We present an algorithm that operates on data qualitatively similar to what children receive: natural language utterances from multiple speakers. Our algorithm is unsupervised and discovers phoneme boundary positions in speech. The approach draws inspiration from the word and text segmentation literature. To demonstrate the efficacy of our algorithm on speech data, we present empirical results on the TIMIT data set. Our method achieves F-measure scores in the 0.68–0.73 range for locating phoneme boundary positions.
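The abstract reports F-measure scores for boundary location but does not describe the scoring procedure. As a rough illustration of how such a score can be computed, the sketch below counts a predicted boundary as correct when it falls within a tolerance window of an unmatched reference boundary. The function name `boundary_f_measure`, the greedy one-to-one matching, and the 20 ms default tolerance are assumptions made for this example; they are not taken from the paper.

```python
from typing import Sequence, Tuple


def boundary_f_measure(
    predicted: Sequence[float],
    reference: Sequence[float],
    tolerance: float = 0.02,  # assumed 20 ms window; not specified in the abstract
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for boundary times in seconds."""
    pred = sorted(predicted)
    ref = sorted(reference)

    matched = 0
    i = j = 0
    # Walk both sorted lists; each reference boundary is matched at most once.
    while i < len(pred) and j < len(ref):
        diff = pred[i] - ref[j]
        if abs(diff) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # predicted boundary is too early to match anything later
        else:
            j += 1  # reference boundary was missed

    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(ref) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical boundary times, for illustration only.
    p, r, f = boundary_f_measure([0.10, 0.25, 0.50, 0.90], [0.11, 0.26, 0.70])
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Widening the tolerance raises the score, so F-measure values are only comparable between systems scored with the same window.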



