A syllable-based Turkish speech recognition system by using time delay neural networks (TDNNs)

2013 International Conference on Soft Computing and Pattern Recognition (SoCPaR) · Pub Date: 2013-12-01 · DOI: 10.1109/SOCPAR.2013.7054130

Burcu Can, Harun Artuner


Citations: 8

Abstract


A syllable-based Turkish speech recognition system by using time delay neural networks (TDNNs)

In this paper, we present a syllable-based model for Turkish speech recognition, in which syllables serve as the recognition units. The main goal of the model is to recognize as much as possible of a given stretch of continuous speech by identifying only a small set of the syllables in the language. For that purpose, only the syllable types with higher frequency are selected for recognition. Using longer recognition units improves recognition because the endpoints of syllables are easier to detect than those of phonemes. Word-based recognition, on the other hand, requires a very large dataset that includes all the words and word forms in the language, which is a challenge of its own. We therefore take advantage of Turkish being an orthographically transparent and syllabified language. Our model employs time delay neural networks (TDNNs) for learning syllables. We achieve an accuracy of 65.6% on our large-vocabulary continuous speech corpus. In addition, we define an algorithm for the automatic detection of syllable boundaries, which gives an accuracy of 44%. The automatic syllable boundary detection module is used for the recognition of isolated syllables rather than continuous speech.
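The idea of covering most of the speech with only a small set of high-frequency syllable types can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_frequent_syllables`, the `top_k` parameter, and the toy hand-syllabified word list are all assumptions made for illustration, and the abstract does not say how many syllable types were selected or how frequency was measured.

```python
from collections import Counter


def select_frequent_syllables(syllabified_words, top_k):
    """Pick the top_k most frequent syllable types and report token coverage.

    syllabified_words: iterable of words, each given as a list of syllables.
    Returns (selected_syllables, coverage), where coverage is the fraction of
    all syllable tokens in the input that belong to the selected types.
    """
    # Count every syllable token across all words.
    counts = Counter(syl for word in syllabified_words for syl in word)
    total = sum(counts.values())

    # Keep only the most frequent syllable types (hypothetical cut-off).
    selected = [syl for syl, _ in counts.most_common(top_k)]

    # Fraction of syllable tokens that a recognizer limited to these
    # types could, at best, account for.
    covered = sum(counts[syl] for syl in selected)
    coverage = covered / total if total else 0.0
    return selected, coverage


# Toy, hand-syllabified Turkish words (illustrative only, not the paper's corpus).
corpus = [
    ["ki", "tap"],              # kitap
    ["ki", "tap", "lar"],       # kitaplar
    ["a", "ra", "ba"],          # araba
    ["a", "ra", "ba", "lar"],   # arabalar
    ["o", "kul"],               # okul
    ["o", "kul", "lar"],        # okullar
]

selected, coverage = select_frequent_syllables(corpus, top_k=3)
print(selected, round(coverage, 2))
```

Raising `top_k` trades a larger recognition inventory for higher coverage; where to stop is a design choice that the abstract leaves unspecified.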
