International Journal of Speech Technology: Latest Articles (Page 8)

An approach for speech enhancement with dysarthric speech recognition using optimization based machine learning frameworks

International Journal of Speech Technology Pub Date : 2023-02-21 DOI: 10.1007/s10772-023-10019-y

Bhuvaneshwari Jolad, Rajashri Khanai

Citations: 2

An empirical study on analysis window functions for text-independent speaker recognition

International Journal of Speech Technology Pub Date : 2023-02-19 DOI: 10.1007/s10772-023-10024-1

Bidhan Barai, N. Das, Subhadip Basu, M. Nasipuri

Citations: 0

A framework for quality assessment of synthesised speech using learning-based objective evaluation

International Journal of Speech Technology Pub Date : 2023-02-02 DOI: 10.1007/s10772-023-10021-4

Shrikant Malviya, Rohit Mishra, Santosh Kumar Barnwal, U. Tiwary

Citations: 0

Weibull and Nakagami speech priors based regularized NMF with adaptive Wiener filter for speech enhancement

International Journal of Speech Technology Pub Date : 2023-02-02 DOI: 10.1007/s10772-023-10020-5

Chaitanya Jannu, S. Vanambathina

Citations: 3

Speaker identification and localization using shuffled MFCC features and deep learning

International Journal of Speech Technology Pub Date : 2023-01-29 DOI: 10.1007/s10772-023-10023-2

Mahdi Barhoush, Ahmed Hallawa, A. Schmeink

Citations: 2

A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech

International Journal of Speech Technology Pub Date : 2023-01-23 DOI: 10.1007/s10772-023-10017-0

Haihua Jiang, Bin Hu, Zhenyu Liu, G. Wang, Lan Zhang

Citations: 0

An automated speech analysis system for the detection of cognitive decline in elderly

International Journal of Speech Technology Pub Date : 2023-01-19 DOI: 10.1007/s10772-023-10016-1

C. Loizou, M. Pantzaris

Citations: 0

Plain-to-clear speech video conversion for enhanced intelligibility

International Journal of Speech Technology Pub Date : 2023-01-01 DOI: 10.1007/s10772-023-10018-z

Shubam Sachdeva, Haoyao Ruan, Ghassan Hamarneh, Dawn M Behne, Allard Jongman, Joan A Sereno, Yue Wang

Abstract: Clearly articulated speech, relative to plain-style speech, has been shown to improve intelligibility. We examine if visible speech cues in video only can be systematically modified to enhance clear-speech visual features and improve intelligibility. We extract clear-speech visual features of English words varying in vowels produced by multiple male and female talkers. Via a frame-by-frame image-warping based video generation method with a controllable parameter (displacement factor), we apply the extracted clear-speech visual features to videos of plain speech to synthesize clear speech videos. We evaluate the generated videos using a robust, state of the art AI Lip Reader as well as human intelligibility testing. The contributions of this study are: (1) we successfully extract relevant visual cues for video modifications across speech styles, and have achieved enhanced intelligibility for AI; (2) this work suggests that universal talker-independent clear-speech features may be utilized to modify any talker's visual speech style; (3) we introduce "displacement factor" as a way of systematically scaling the magnitude of displacement modifications between speech styles; and (4) the high definition generated videos make them ideal candidates for human-centric intelligibility and perceptual training studies.

Vol. 26, No. 1, pp. 163-184. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10042924/pdf/

Citations: 0

Retraction Note: Nonlinear acoustic noise cancellation based automatic speech recognition system (NANC-ASR) with convolutional neural networks

International Journal of Speech Technology Pub Date : 2022-12-01 DOI: 10.1007/s10772-022-09991-8

R. Ramadan, Kusum Yadav

Citations: 0

Retraction Note: Preserving learnability and intelligibility at the point of care with assimilation of different speech recognition techniques

International Journal of Speech Technology Pub Date : 2022-12-01 DOI: 10.1007/s10772-022-09996-3

Sukumar Rajendran, P. Jayagopal

Citations: 0