International Journal of Speech Technology: Latest Articles

Fusion of speech and handwritten signatures biometrics for person identification
International Journal of Speech Technology Pub Date : 2023-11-01 DOI: 10.1007/s10772-023-10052-x
Ahmad A. M. Abushariah, Mohammad A. M. Abushariah, Teddy Surya Gunawan, J. Chebil, Assal A. M. Alqudah, Hua-Nong Ting, Mumtaz Begum Peer Mustafa
Citations: 0
A hybrid adaptive neuro-fuzzy approach for automatic spoken digit recognition
International Journal of Speech Technology Pub Date : 2023-10-31 DOI: 10.1007/s10772-023-10057-6
Irshed Hussain, Pinki Roy
Citations: 0
Multi-task learning for X-vector based speaker recognition
International Journal of Speech Technology Pub Date : 2023-10-28 DOI: 10.1007/s10772-023-10058-5
Yingjie Zhang, Liu Liu
Abstract: In this paper, we propose a speaker recognition system that leverages multi-task learning and features integration (MTFI) to improve the performance of x-vector based speaker recognition models. It is important to integrate complementary information from different features such as MFCC, Fbank, spectrogram and LPCC, because a single feature usually cannot cover all information about a speaker and generalizes insufficiently. Since the x-vector model outputs affine transformation values with the penultimate hidden layer in the trained model, the parameter distribution of this layer should be stable and should not be affected by tasks that are not current branches when switching tasks. Therefore, we propose a shared unit (SU) in multi-task learning, which is useful for sharing common representations and other auxiliary tasks. Then, an attention mechanism is designed to calculate the frame weight in the statistical pooling layer, so as to enhance the key frame information. The proposed system had an EER of 0.98% on VoxCeleb1, and the average score fusion obtained an EER of 0.65%.
Citations: 0
Unsupervised spoken term discovery using pseudo lexical induction
International Journal of Speech Technology Pub Date : 2023-10-26 DOI: 10.1007/s10772-023-10049-6
P. Sudhakar, K. Sreenivasa Rao, Pabitra Mitra
Citations: 0
Bird species recognition using spiking neural network along with distance based fuzzy co-clustering
International Journal of Speech Technology Pub Date : 2023-09-13 DOI: 10.1007/s10772-023-10040-1
Ricky Mohanty, Hemanta Kumar Bhuyan, Subhendu Kumar Pani, Vinayakumar Ravi, Moez Krichen
Citations: 0
Towards modeling raw speech in gender identification of children using sincNet over ERB scale
International Journal of Speech Technology Pub Date : 2023-09-08 DOI: 10.1007/s10772-023-10039-8
Kodali Radha, Mohan Bansal
Citations: 0
Application of probabilistic neural network for speech emotion recognition
International Journal of Speech Technology Pub Date : 2023-09-06 DOI: 10.1007/s10772-023-10037-w
Shrikala Deshmukh, Preeti Gupta
Citations: 0
Automatic age recognition, call-type classification, and speaker identification of Zebra Finches (Taeniopygia guttata) using hidden Markov models (HMMs)
International Journal of Speech Technology Pub Date : 2023-09-04 DOI: 10.1007/s10772-023-10041-0
Marek B. Trawicki
Citations: 0
Speech signal analysis and enhancement using combined wavelet Fourier transform with stacked deep learning architecture
International Journal of Speech Technology Pub Date : 2023-09-01 DOI: 10.1007/s10772-023-10044-x
V. Srinivasarao
Citations: 0
Deep learning structure for emotion prediction using MFCC from native languages
International Journal of Speech Technology Pub Date : 2023-09-01 DOI: 10.1007/s10772-023-10047-8
A. Suresh Rao, A. Pramod Reddy, Pragathi Vulpala, K. Shwetha Rani, P. Hemalatha
Citations: 0