Horderlin Vrangel Robles Vega, V. Molina, Luis Martinez
2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), published 2016-08-01
DOI: 10.1109/STSIVA.2016.7743346 (https://doi.org/10.1109/STSIVA.2016.7743346)
Citations: 1
VAD algorithms energy-based and spectral-domain applied in River Plate Castilian
Because English and Castilian have marked acoustic and phonetic differences, this paper studies the effectiveness of different VAD (Voice Activity Detection) algorithms from the literature when applied to Castilian, especially the Rioplatense variety. The article is intended to report the results achieved to date. The first part of the document briefly explains the three implemented methods: the short-time autocorrelation function (STACF), the average magnitude difference function (FDMA in the original; commonly abbreviated AMDF) and linear prediction coefficients (LPC). Next, the tests and experiments carried out with the BEPPA battery to evaluate the effectiveness of these VAD algorithms are described. In this step, 10 sentences selected from the Rioplatense Spanish BEPPA battery were given to each VAD to detect voiced, unvoiced and silence segments. The results obtained in the experimental phase are then presented, and the classifications are evaluated using a confusion matrix over the 10 sentences, which contain 65 words and about 40 silence segments. Finally, conclusions and future work are described. The results clearly show that the implemented algorithms do not show overall efficiency in detecting voice activity in the Spanish of the Río de la Plata. We also found that the algorithms implemented using linear prediction coefficients show better performance.
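As a rough illustration of two of the features named in the abstract, the following minimal sketch computes the short-time autocorrelation and the average magnitude difference for one frame, then labels the frame as voiced, unvoiced or silence. It is not the authors' implementation: the function names, the 8 kHz sampling rate, the 30 ms frame length, the lag range and both thresholds are assumptions chosen only for the example, and the LPC-based method is not sketched here.

```python
import math
import random


def split_frames(signal, frame_len, hop):
    """Cut a sample list into fixed-length frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def autocorr(frame, lag):
    """Short-time autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def amdf(frame, lag):
    """Average magnitude difference of one frame at the given lag."""
    count = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(count)) / count


def classify_frame(frame, min_lag, max_lag,
                   energy_thr=1e-4, periodicity_thr=0.5):
    """Label a frame 'silence', 'voiced' or 'unvoiced'.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    r0 = autocorr(frame, 0)
    if r0 / len(frame) < energy_thr:       # mean energy too low: silence
        return "silence"
    # Largest normalized autocorrelation over the candidate pitch lags.
    peak = max(autocorr(frame, k) / r0 for k in range(min_lag, max_lag + 1))
    return "voiced" if peak >= periodicity_thr else "unvoiced"


FS = 8000                    # assumed sampling rate in Hz
FRAME = 240                  # assumed 30 ms frame
MIN_LAG, MAX_LAG = 20, 160   # pitch search range, 400 Hz down to 50 Hz

rng = random.Random(0)       # fixed seed so the noise frame is reproducible
tone = [0.5 * math.sin(2 * math.pi * 100 * n / FS) for n in range(FRAME)]
noise = [rng.uniform(-0.1, 0.1) for _ in range(FRAME)]
quiet = [0.0] * FRAME

labels = [classify_frame(f, MIN_LAG, MAX_LAG) for f in (tone, noise, quiet)]
print(labels)
print(amdf(tone, 80))        # lag of one 100 Hz period: difference is near zero
```

A periodic frame produces a strong autocorrelation peak, and a dip in the average magnitude difference, at the lag of its pitch period, while a noise-like frame has energy but no such peak. This is why the sketch uses energy to separate silence and periodicity to separate voiced from unvoiced frames.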