International Journal of Speech Technology: Latest Articles

An approach for speech enhancement with dysarthric speech recognition using optimization based machine learning frameworks
International Journal of Speech Technology Pub Date: 2023-02-21 DOI: 10.1007/s10772-023-10019-y
Bhuvaneshwari Jolad, Rajashri Khanai
Volume (as listed): 26 1 Pages: 287-305
Citations: 2
An empirical study on analysis window functions for text-independent speaker recognition
International Journal of Speech Technology Pub Date: 2023-02-19 DOI: 10.1007/s10772-023-10024-1
Bidhan Barai, N. Das, Subhadip Basu, M. Nasipuri
Volume (as listed): 26 1 Pages: 211-220
Citations: 0
A framework for quality assessment of synthesised speech using learning-based objective evaluation
International Journal of Speech Technology Pub Date: 2023-02-02 DOI: 10.1007/s10772-023-10021-4
Shrikant Malviya, Rohit Mishra, Santosh Kumar Barnwal, U. Tiwary
Volume (as listed): 26 1 Pages: 221-243
Citations: 0
Weibull and Nakagami speech priors based regularized NMF with adaptive Wiener filter for speech enhancement
International Journal of Speech Technology Pub Date: 2023-02-02 DOI: 10.1007/s10772-023-10020-5
Chaitanya Jannu, S. Vanambathina
Volume (as listed): 26 1 Pages: 197-209
Citations: 3
Speaker identification and localization using shuffled MFCC features and deep learning
International Journal of Speech Technology Pub Date: 2023-01-29 DOI: 10.1007/s10772-023-10023-2
Mahdi Barhoush, Ahmed Hallawa, A. Schmeink
Volume (as listed): 26 1 Pages: 185-196
Citations: 2
A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech
International Journal of Speech Technology Pub Date: 2023-01-23 DOI: 10.1007/s10772-023-10017-0
Haihua Jiang, Bin Hu, Zhenyu Liu, G. Wang, Lan Zhang
Volume (as listed): 1 1 Pages: 1-8
Citations: 0
An automated speech analysis system for the detection of cognitive decline in elderly
International Journal of Speech Technology Pub Date: 2023-01-19 DOI: 10.1007/s10772-023-10016-1
C. Loizou, M. Pantzaris
Volume (as listed): 1 1 Pages: 1-17
Citations: 0
Plain-to-clear speech video conversion for enhanced intelligibility
International Journal of Speech Technology Pub Date: 2023-01-01 DOI: 10.1007/s10772-023-10018-z
Shubam Sachdeva, Haoyao Ruan, Ghassan Hamarneh, Dawn M Behne, Allard Jongman, Joan A Sereno, Yue Wang
Volume (as listed): 26 1 Pages: 163-184
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10042924/pdf/
Abstract: Clearly articulated speech, relative to plain-style speech, has been shown to improve intelligibility. We examine whether visible speech cues, in video only, can be systematically modified to enhance clear-speech visual features and improve intelligibility. We extract clear-speech visual features of English words varying in vowels produced by multiple male and female talkers. Via a frame-by-frame, image-warping-based video generation method with a controllable parameter (displacement factor), we apply the extracted clear-speech visual features to videos of plain speech to synthesize clear-speech videos. We evaluate the generated videos using a robust, state-of-the-art AI Lip Reader as well as human intelligibility testing.
The contributions of this study are: (1) we successfully extract relevant visual cues for video modifications across speech styles, and have achieved enhanced intelligibility for AI; (2) this work suggests that universal talker-independent clear-speech features may be utilized to modify any talker's visual speech style; (3) we introduce "displacement factor" as a way of systematically scaling the magnitude of displacement modifications between speech styles; and (4) the high-definition generated videos are ideal candidates for human-centric intelligibility and perceptual training studies.
Citations: 0
Retraction Note: Nonlinear acoustic noise cancellation based automatic speech recognition system (NANC-ASR) with convolutional neural networks
International Journal of Speech Technology Pub Date: 2022-12-01 DOI: 10.1007/s10772-022-09991-8
R. Ramadan, Kusum Yadav
Volume (as listed): 25 1 Pages: 49
Citations: 0
Retraction Note: Preserving learnability and intelligibility at the point of care with assimilation of different speech recognition techniques
International Journal of Speech Technology Pub Date: 2022-12-01 DOI: 10.1007/s10772-022-09996-3
Sukumar Rajendran, P. Jayagopal
Volume (as listed): 25 1 Pages: 39
Citations: 0