WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages

IF 7.7 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Tripti Choudhary;Vishal Goyal;Atul Bansal
{"title":"WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages","authors":"Tripti Choudhary;Vishal Goyal;Atul Bansal","doi":"10.26599/BDMA.2022.9020017","DOIUrl":null,"url":null,"abstract":"Automatic speech recognition systems are developed for translating the speech signals into the corresponding text representation. This translation is used in a variety of applications like voice enabled commands, assistive devices and bots, etc. There is a significant lack of efficient technology for Indian languages. In this paper, an wavelet transformer for automatic speech recognition (WTASR) of Indian language is proposed. The speech signals suffer from the problem of high and low frequency over different times due to variation in speech of the speaker. Thus, wavelets enable the network to analyze the signal in multiscale. The wavelet decomposition of the signal is fed in the network for generating the text. The transformer network comprises an encoder decoder system for speech translation. The model is trained on Indian language dataset for translation of speech into corresponding text. The proposed method is compared with other state of the art methods. The results show that the proposed WTASR has a low word error rate and can be used for effective speech recognition for Indian language.","PeriodicalId":52355,"journal":{"name":"Big Data Mining and Analytics","volume":"6 1","pages":"85-91"},"PeriodicalIF":7.7000,"publicationDate":"2022-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8254253/9962810/09962811.pdf","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data Mining and Analytics","FirstCategoryId":"1093","ListUrlMain":"https://ieeexplore.ieee.org/document/9962811/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 4

Abstract

Automatic speech recognition systems are developed for translating the speech signals into the corresponding text representation. This translation is used in a variety of applications like voice enabled commands, assistive devices and bots, etc. There is a significant lack of efficient technology for Indian languages. In this paper, an wavelet transformer for automatic speech recognition (WTASR) of Indian language is proposed. The speech signals suffer from the problem of high and low frequency over different times due to variation in speech of the speaker. Thus, wavelets enable the network to analyze the signal in multiscale. The wavelet decomposition of the signal is fed in the network for generating the text. The transformer network comprises an encoder decoder system for speech translation. The model is trained on Indian language dataset for translation of speech into corresponding text. The proposed method is compared with other state of the art methods. The results show that the proposed WTASR has a low word error rate and can be used for effective speech recognition for Indian language.
WTASR:用于印度语语音自动识别的小波变换器
开发了用于将语音信号翻译成相应文本表示的自动语音识别系统。这种翻译用于各种应用程序,如语音命令、辅助设备和机器人等。印度语言严重缺乏有效的技术。本文提出了一种用于印度语自动语音识别的小波变换器。由于讲话者的语音变化,语音信号在不同时间上存在高频和低频的问题。因此,小波使网络能够以多尺度分析信号。信号的小波分解被馈送到网络中用于生成文本。变换器网络包括用于语音翻译的编码器-解码器系统。该模型在印度语言数据集上进行训练,用于将语音翻译成相应的文本。将所提出的方法与其他现有技术的方法进行比较。结果表明,所提出的WTASR具有较低的误字率,可用于印度语的有效语音识别。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Big Data Mining and Analytics
Big Data Mining and Analytics Computer Science-Computer Science Applications
CiteScore
20.90
自引率
2.20%
发文量
84
期刊介绍: Big Data Mining and Analytics, a publication by Tsinghua University Press, presents groundbreaking research in the field of big data research and its applications. This comprehensive book delves into the exploration and analysis of vast amounts of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge. Featuring the latest developments, research issues, and solutions, this book offers valuable insights into the world of big data. It provides a deep understanding of data mining techniques, data analytics, and their practical applications. Big Data Mining and Analytics has gained significant recognition and is indexed and abstracted in esteemed platforms such as ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and more. With its wealth of information and its ability to transform the way we perceive and utilize data, this book is a must-read for researchers, professionals, and anyone interested in the field of big data analytics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信