多说话人连续语音中音素边界的无监督发现

2011 IEEE International Conference on Development and Learning (ICDL) Pub Date : 2011-10-10 DOI:10.1109/DEVLRN.2011.6037316

T. Armstrong, Stephanie Antetomaso

{"title":"多说话人连续语音中音素边界的无监督发现","authors":"T. Armstrong, Stephanie Antetomaso","doi":"10.1109/DEVLRN.2011.6037316","DOIUrl":null,"url":null,"abstract":"Children rapidly learn the inventory of phonemes used in their native tongues. Computational approaches to learning phoneme boundaries from speech data do not yet reach the level of human performance. We present an algorithm that operates on, qualitatively, similar data to those children receive: natural language utterances from multiple speakers. Our algorithm is unsupervised and discovers phoneme boundary positions in speech. The approach draws inspiration from the word and text segmentation literature. To demonstrate the efficacy of our algorithm on speech data, we present empirical results of our method using the TIMIT data set. Our method achieves F-measure scores in the 0.68 – 0.73 range for locating phoneme boundary positions.","PeriodicalId":256921,"journal":{"name":"2011 IEEE International Conference on Development and Learning (ICDL)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Unsupervised discovery of phoneme boundaries in multi-speaker continuous speech\",\"authors\":\"T. Armstrong, Stephanie Antetomaso\",\"doi\":\"10.1109/DEVLRN.2011.6037316\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Children rapidly learn the inventory of phonemes used in their native tongues. Computational approaches to learning phoneme boundaries from speech data do not yet reach the level of human performance. We present an algorithm that operates on, qualitatively, similar data to those children receive: natural language utterances from multiple speakers. Our algorithm is unsupervised and discovers phoneme boundary positions in speech. The approach draws inspiration from the word and text segmentation literature. To demonstrate the efficacy of our algorithm on speech data, we present empirical results of our method using the TIMIT data set. Our method achieves F-measure scores in the 0.68 – 0.73 range for locating phoneme boundary positions.\",\"PeriodicalId\":256921,\"journal\":{\"name\":\"2011 IEEE International Conference on Development and Learning (ICDL)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE International Conference on Development and Learning (ICDL)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DEVLRN.2011.6037316\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE International Conference on Development and Learning (ICDL)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DEVLRN.2011.6037316","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

孩子们很快就学会了母语中使用的音素清单。从语音数据中学习音素边界的计算方法尚未达到人类表现的水平。我们提出了一种算法，定性地处理这些孩子接收到的类似数据:来自多个说话者的自然语言话语。我们的算法是无监督的，并发现语音中的音素边界位置。该方法从词和文本分词文献中获得灵感。为了证明我们的算法在语音数据上的有效性，我们使用TIMIT数据集给出了我们的方法的实证结果。我们的方法在定位音素边界位置的F-measure得分范围在0.68 - 0.73之间。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Unsupervised discovery of phoneme boundaries in multi-speaker continuous speech

Children rapidly learn the inventory of phonemes used in their native tongues. Computational approaches to learning phoneme boundaries from speech data do not yet reach the level of human performance. We present an algorithm that operates on, qualitatively, similar data to those children receive: natural language utterances from multiple speakers. Our algorithm is unsupervised and discovers phoneme boundary positions in speech. The approach draws inspiration from the word and text segmentation literature. To demonstrate the efficacy of our algorithm on speech data, we present empirical results of our method using the TIMIT data set. Our method achieves F-measure scores in the 0.68 – 0.73 range for locating phoneme boundary positions.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE International Conference on Development and Learning (ICDL)

自引率

0.00%

发文量