International Journal of Speech Technology: Latest Articles

Assessing American presidential candidates using principles of ontological engineering, word sense disambiguation, data envelope analysis and qualitative comparative analysis
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10043-y
James A. Rodger, Justin Piper
Citations: 0

Monaural speech separation using WT-Conv-TasNet for hearing aids
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10045-w
Jharna Agrawal, Manish Gupta, Hitendra Garg
Citations: 0

Time frequency domain deep CNN for automatic background classification in speech signals
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10042-z
Rakesh Reddy Yakkati, Sreenivasa Reddy Yeduri, Rajesh Kumar Tripathy, Linga Reddy Cenkeramaddi
Abstract: Automatic background classification from speech signals is used in many application areas, such as background identification, predictive maintenance in industrial settings, smart home applications, assisting deaf people in their daily activities, and content-based multimedia indexing and retrieval. Accurately predicting the background environment from speech signal information is challenging. This paper therefore proposes a novel synchrosqueezed wavelet transform (SWT)-based deep learning (DL) approach for automatically classifying background information embedded in speech signals. SWT is used to obtain a time-frequency representation of the speech signal, which is then fed to a deep convolutional neural network (DCNN) for classification. The proposed DCNN model consists of three convolution layers, one batch-normalization layer, three max-pooling layers, one dropout layer, and one fully connected layer. The method is tested on a variety of background signals embedded in speech, including airport, airplane, drone, street, babble, car, helicopter, exhibition, station, restaurant, and train sounds. According to the results, the proposed SWT-based DCNN approach achieves an overall classification accuracy of 97.96 (± 0.53)% in classifying background information embedded in speech signals. Finally, the performance of the proposed approach is compared to existing methods.
Citations: 0

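The abstract above specifies the layer stack of the proposed DCNN exactly (three convolution layers, one batch-normalization layer, three max-pooling layers, one dropout layer, one fully connected layer). As a minimal PyTorch sketch of that stack: the kernel sizes, channel counts, input resolution of the SWT time-frequency image, and the number of output classes (11, matching the background types listed) are all assumptions, since the listing does not give these details.

```python
import torch
import torch.nn as nn


class BackgroundDCNN(nn.Module):
    """Sketch of the DCNN layer stack named in the abstract.

    Kernel sizes, channel widths, the 128x128 input, and the dropout
    rate are illustrative assumptions, not values from the paper.
    """

    def __init__(self, n_classes: int = 11):
        super().__init__()
        self.features = nn.Sequential(
            # conv 1, followed by the single batch-normalization layer
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.MaxPool2d(2),          # max-pool 1: 128 -> 64
            # conv 2
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),          # max-pool 2: 64 -> 32
            # conv 3
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),          # max-pool 3: 32 -> 16
        )
        self.dropout = nn.Dropout(0.5)
        # one fully connected layer maps the flattened features to classes
        self.fc = nn.Linear(64 * 16 * 16, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.dropout(torch.flatten(x, 1))
        return self.fc(x)


model = BackgroundDCNN()
# batch of 4 single-channel SWT time-frequency images
logits = model(torch.randn(4, 1, 128, 128))
print(logits.shape)  # torch.Size([4, 11])
```

In practice each SWT image would be the synchrosqueezed wavelet transform of a short speech segment, and the class with the highest logit is the predicted background environment.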
Voice user interfaces in manufacturing logistics: a literature review
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10036-x
Heiner Ludwig, Thorsten Schmidt, Mathias Kühn
Citations: 0

Robust automatic accent identification based on the acoustic evidence
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10031-2
Eiman Alsharhan, Allan Ramsay
Citations: 0

Using combined features to improve speaker verification in the face of limited reverberant data
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10048-7
Khamis A. Al-Karawi, Duraid Y. Mohammed
Citations: 0

Binary classifier for identification of stammering instances in Hindi speech data
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10046-9
Shivam Dwivedi, Sanjukta Ghosh, Satyam Dwivedi
Citations: 0

An automatic speech recognition system for isolated Amazigh word using 1D & 2D CNN-LSTM architecture
International Journal of Speech Technology · Pub Date: 2023-09-01 · DOI: 10.1007/s10772-023-10054-9
Mohamed Daouad, Fadoua Ataa Allah, El Wardani Dadi
Citations: 0

Speaker and gender dependencies in within/cross linguistic Speech Emotion Recognition
International Journal of Speech Technology · Pub Date: 2023-08-25 · DOI: 10.1007/s10772-023-10038-9
Adil Chakhtouna, Sara Sekkate, A. Adib
Citations: 0

An efficient speech emotion recognition based on a dual-stream CNN-transformer fusion network
International Journal of Speech Technology · Pub Date: 2023-07-01 · DOI: 10.1007/s10772-023-10035-y · Pages: 541-557
Mohammed Tellai, L. Gao, Qi-rong Mao
Citations: 0