Persian Language Phone Recognition Based on Robust Extraction of Acoustic Landmarks

Shaghayegh Reza, S. Seyyedsalehi, Seyyedeh Zohreh Seyyedsalehi
Published in: 2020 27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME)
Publication date: 2020-11-26
DOI: 10.1109/ICBME51989.2020.9319436 (https://doi.org/10.1109/ICBME51989.2020.9319436)
Citations: 1

Abstract

Acoustic landmarks are defined as the more informative parts of the speech signal and have been shown to benefit the design of more robust speech recognition systems. This work presents a Persian phone recognition system based on acoustic landmarks. To this end, acoustic landmarks appropriate for the Persian language were selected, and an artificial neural network was trained to recognize them. To boost the performance of the landmark recognition system, the model's structure and training method were then modified. The goal of these modifications is to filter out variations in the acoustic landmarks as much as possible. For this, we used neural network structures to map landmarks nonlinearly to their corresponding gold landmarks, that is, the landmarks that our landmark recognition system recognizes without any error. The experiments were carried out on the Persian database Farsdat. The best landmark recognition model is a feedforward neural network with five hidden layers, which achieves a phone error rate (PER) of 21.74%. We also obtained a 0.56 percent improvement in PER with our best variation filtering method.
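
The abstract reports its results as a phone error rate (PER) but does not say how the rate was computed. The sketch below assumes the conventional definition: the minimum number of substitutions, deletions and insertions needed to turn the reference phone sequence into the recognized one, divided by the reference length. The function names and the sample phone strings are hypothetical illustrations and are not taken from the paper or from Farsdat.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, deletions and insertions
    needed to turn the reference sequence into the hypothesis."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # an empty hypothesis prefix needs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER in percent, under the assumed conventional definition."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substituted phone out of five.
reference = ["s", "a", "l", "a", "m"]
recognized = ["s", "a", "l", "o", "m"]
print(phone_error_rate(reference, recognized))
```

Under this definition the rate is normalized by the reference length, so a hypothesis with many insertions can score above 100 percent.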