Text-Dependent Multilingual Speaker Identification for Indian Languages Using Artificial Neural Network

2010 3rd International Conference on Emerging Trends in Engineering and Technology
Pub Date: 2010-11-19
DOI: 10.1109/ICETET.2010.23

Rajesh Ranjan, S. Singh, A. Shukla, R. Tiwari


Citations: 15

Abstract


This paper describes an attempt to develop a speaker identification system, which determines the identity of an unknown speaker, from a sample of his or her voice, among several speakers whose speech characteristics are known. Every speaker has individual characteristics embedded in his or her speech utterances. These characteristics can be extracted from the utterances, and different neural network models are used to identify the speaker from them. The utterances are stored in digitized form so that their speech characteristics can be evaluated. Five kinds of speech features, namely LPC, RC, APSD, number of zero crossings, and formant frequencies, are extracted from the speech signal and assembled into speech feature vectors. These feature vectors are fed into an artificial neural network, which is trained with a back-propagation learning algorithm and with a clustering algorithm to identify the different speakers. The database used for this system consists of 20 speakers, both male and female, from different parts of India; the languages are Hindi, Sanskrit, Punjabi, and Telugu. An average identification rate of 83.29% is achieved when the network is trained using the back-propagation algorithm; with the clustering algorithm the rate improves by about 9 percentage points, reaching 92.78%.
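The feature-extraction step mentioned in the abstract can be illustrated with a minimal Python sketch. It covers only three of the listed features and rests on assumptions that the abstract does not confirm: that LPC and RC denote linear-prediction and reflection coefficients, that they are obtained by the standard Levinson-Durbin recursion on a frame's autocorrelation, and that the zero-crossing feature is a plain count of sign changes. The function names, the frame length of 256 samples, and the model order of 8 are hypothetical choices for illustration, not values from the paper; APSD and formant frequencies are left out.

```python
import numpy as np


def zero_crossings(frame):
    """Count sign changes between consecutive samples (0.0 counts as positive)."""
    signs = np.signbit(frame)
    return int(np.count_nonzero(signs[1:] != signs[:-1]))


def lpc_and_reflection(frame, order):
    """Levinson-Durbin recursion on the frame's autocorrelation.

    Returns (a, k): the predictor polynomial a[0..order] with a[0] = 1,
    and the reflection coefficients k[0..order-1].
    Assumes the frame is not silent (its energy r[0] must be non-zero).
    """
    n = len(frame)
    # Autocorrelation at lags 0..order (unnormalised, biased estimate).
    r = np.array([np.dot(frame[:n - lag], frame[lag:]) for lag in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    k = np.zeros(order)
    err = r[0]  # prediction error energy of the order-0 model
    for i in range(1, order + 1):
        # Correlation between the current residual and the new lag.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k_i = -acc / err
        k[i - 1] = k_i
        a_prev = a.copy()
        # Update the existing coefficients, then append the new one.
        a[1:i] = a_prev[1:i] + k_i * a_prev[i - 1:0:-1]
        a[i] = k_i
        err *= 1.0 - k_i ** 2
    return a, k


# Hypothetical usage: one synthetic 256-sample frame, order-8 model.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
a, k = lpc_and_reflection(frame, order=8)
feature_vector = np.concatenate([a[1:], k, [zero_crossings(frame)]])
```

In this sketch the feature vector for one frame has 17 entries (8 predictor coefficients, 8 reflection coefficients, and the zero-crossing count). How the paper actually frames the signal, orders the features, or scales them before feeding the network is not stated in the abstract.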
