2022 IEEE Spoken Language Technology Workshop (SLT): Latest Publications

Distilling Sequence-to-Sequence Voice Conversion Models for Streaming Conversion Applications
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10023432
Kou Tanaka, H. Kameoka, Takuhiro Kaneko, Shogo Seki
Abstract: This paper describes a method for distilling a recurrent sequence-to-sequence (S2S) voice conversion (VC) model. Although recent VC systems achieve increasingly high quality, streaming conversion remains a challenge in practical applications. To achieve streaming VC, the conversion model needs a streamable structure: causal layers rather than non-causal layers. Motivated by this constraint and by recent advances in S2S learning, we apply the teacher-student framework to recurrent S2S VC models. A major challenge is minimizing the degradation caused by causal layers, which mask future input information. Experimental evaluations show that, except for male-to-female speaker conversion, our approach maintains the teacher model's performance in subjective evaluations despite the streamable student model structure. Audio samples can be accessed at http://www.kecl.ntt.co.jp/people/tanaka.ko/projects/dists2svc.
Citations: 2
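The streamability constraint above hinges on causal layers that never look at future frames. A minimal sketch of that property (plain Python, illustrative only; the paper's models are recurrent S2S networks, not this toy convolution):

```python
def causal_conv1d(x, kernel):
    """1-D convolution that sees only current and past samples: left-pad
    the input by (len(kernel) - 1) zeros so output[t] depends only on
    x[:t+1] -- the streamable property a causal layer provides."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(padded[t + i] * kernel[i] for i in range(k))
            for t in range(len(x))]

# Perturbing a future frame must not change earlier outputs.
kernel = [0.5, 0.25, 0.25]
y1 = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel)
y2 = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)  # only last frame differs
assert y1[:3] == y2[:3]
```

A non-causal layer, by contrast, would mix future frames into every output, which is exactly what forces the teacher-student distillation in the paper: the student must reproduce the teacher's behavior without that lookahead.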
Streaming Bilingual End-to-End ASR Model Using Attention Over Multiple Softmax
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10022475
Aditya Patil, Vikas Joshi, Purvi Agrawal, Rupeshkumar Mehta
Abstract: Even with several advancements in multilingual modeling, recognizing multiple languages with a single neural model, without knowing the input language, remains challenging; most multilingual models assume the input language is available. In this work, we propose a novel bilingual end-to-end (E2E) modeling approach in which a single neural model recognizes both languages and supports switching between them, without any language input from the user. The proposed model has shared encoder and prediction networks, with language-specific joint networks that are combined via a self-attention mechanism. Because the language-specific posteriors are combined into a single posterior probability over all output symbols, the model enables a single beam-search decoding and allows dynamic switching between languages. The proposed approach outperforms the conventional bilingual baseline with 13.3%, 8.23%, and 1.3% relative word error rate reductions on Hindi, English, and code-mixed test sets, respectively.
Citations: 0
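The combination step described above can be sketched in a few lines. This is an illustrative simplification, not the paper's implementation: the logits are made up, and the attention scores are given directly rather than computed by the model's self-attention over the joint-network outputs.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def combine_posteriors(logits_per_lang, attn_scores):
    """Weight each language-specific softmax output by attention weights
    over the languages, yielding one posterior over the shared symbol
    set -- so a single beam search can decode and the model can switch
    languages frame by frame."""
    weights = softmax(attn_scores)               # attention over languages
    posteriors = [softmax(l) for l in logits_per_lang]
    vocab_size = len(posteriors[0])
    return [sum(w * p[i] for w, p in zip(weights, posteriors))
            for i in range(vocab_size)]

hindi_logits = [2.0, 0.1, 0.1]
english_logits = [0.1, 2.0, 0.1]
p = combine_posteriors([hindi_logits, english_logits], attn_scores=[1.5, 0.2])
assert abs(sum(p) - 1.0) < 1e-9   # a convex combination is still a distribution
```

Because the combination is convex, the result is a single valid distribution over all output symbols, which is what makes one shared beam search possible.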
Code-Switched Language Modelling Using a Code Predictive Lstm in Under-Resourced South African Languages
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10022517
Joshua Jansen van Vüren, T. Niesler
Abstract: We present a new LSTM language model architecture for code-switched speech incorporating a neural structure that explicitly models language switches. Experimental evaluation of this code predictive model for four under-resourced South African languages shows consistent improvements in perplexity, as well as in perplexity specifically over code-switches, compared to an LSTM baseline. Substantial reductions in absolute speech recognition word error rates (0.5%-1.2%), as well as in errors specifically at code-switches (0.6%-2.3%), are also achieved during n-best rescoring. When used for both data augmentation and n-best rescoring, our code predictive model reduces word error rate by a further 0.8%-2.6% absolute and consistently outperforms a baseline LSTM. The similar and consistent trends observed across all four language pairs allow us to conclude that explicit modelling of language switches by a dedicated language model component is a suitable strategy for code-switched speech recognition.
Citations: 3
NAM+: Towards Scalable End-to-End Contextual Biasing for Adaptive ASR
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10023323
Tsendsuren Munkhdalai, Zelin Wu, G. Pundak, K. Sim, Jiayang Li, Pat Rondon, Tara N. Sainath
Abstract: Attention-based biasing techniques for end-to-end ASR systems are able to achieve large accuracy gains without requiring the inference algorithm adjustments and parameter tuning common to fusion approaches. However, it is challenging to simultaneously scale up attention-based biasing to realistic numbers of biased phrases; maintain in-domain WER gains while minimizing out-of-domain losses; and run in real time. We present NAM+, an attention-based biasing approach which achieves a 16X inference speedup per acoustic frame over prior work when run with 3,000 biasing entities, as measured on a typical mobile CPU. NAM+ achieves these run-time gains through a combination of Two-Pass Hierarchical Attention and Dilated Context Update. Compared to the adapted baseline, NAM+ further decreases the in-domain WER by up to 12.6% relative, while incurring an out-of-domain WER regression of 20% relative. Compared to the non-adapted baseline, the out-of-domain WER regression is 7.1% relative.
Citations: 7
Cover Page
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/slt54892.2023.10022896
Citations: 0
Hackathon
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/slt54892.2023.10023077
DO Objetivo
Abstract: The theme of the "Hackathon DAF e PCTec/UnB" will be "UnB na palma da sua mão" ("UnB in the palm of your hand"). The idea is to approach innovation in a way that generates direct benefits for the Universidade de Brasília, by building an app through which students, professors, staff, and the academic community at large can monitor the services provided by the companies responsible for the various contracts currently in force with the university.
Citations: 2
Response Timing Estimation for Spoken Dialog Systems Based on Syntactic Completeness Prediction
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10023458
Jin Sakuma, S. Fujie, Tetsunori Kobayashi
Abstract: Appropriate response timing is very important for achieving smooth dialog progression. Conventionally, prosodic, temporal, and linguistic features have been used to determine timing. In addition to these conventional features, we propose to utilize the syntactic completeness after a certain time, which represents whether the other party is about to finish speaking. We generate the next token sequence from intermediate speech recognition results using a language model and obtain the probability of the end of utterance appearing K tokens ahead, where K varies from 1 to M. This yields an M-dimensional vector, which we denote the estimates of syntactic completeness (ESC). We evaluated this method on a simulated dialog database of a restaurant information center. The results confirmed that considering ESC improves the performance of response timing estimation, especially the accuracy of quick responses, compared with a method using only conventional features.
Citations: 6
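The ESC vector described above can be sketched with a toy language model. Everything here is an assumption for illustration: the three-word vocabulary, the hand-set probabilities, and the greedy continuation (the paper generates token sequences from a trained LM over intermediate ASR results).

```python
VOCAB = ["food", "please", "<eos>"]

def toy_lm(prefix):
    """Stand-in next-token distribution over VOCAB; the numbers are
    illustrative only, not from any trained model."""
    last = prefix[-1] if prefix else ""
    if last == "want":
        return [0.85, 0.10, 0.05]   # "food" likely next
    if last == "food":
        return [0.10, 0.75, 0.15]   # "please" likely next
    if last == "please":
        return [0.05, 0.05, 0.90]   # utterance likely complete
    return [0.60, 0.35, 0.05]

def esc_vector(prefix, M=3):
    """Estimates of syntactic completeness: for K = 1..M, the probability
    of <eos> appearing K tokens ahead along the LM's greedy continuation
    of the partial recognition result (a simplification of the paper's
    token-sequence generation)."""
    esc, ctx = [], list(prefix)
    for _ in range(M):
        p = toy_lm(ctx)
        esc.append(p[VOCAB.index("<eos>")])
        ctx.append(VOCAB[p.index(max(p))])   # greedy next token
    return esc

esc = esc_vector(["i", "want"])
assert esc[2] > esc[0]   # completeness rises as the utterance nears its end
```

The resulting M-dimensional vector rises toward 1 as the utterance approaches a syntactically complete point, which is the signal the timing estimator consumes alongside prosodic and temporal features.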
Peppanet: Effective Mispronunciation Detection and Diagnosis Leveraging Phonetic, Phonological, and Acoustic Cues
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10022472
Bi-Cheng Yan, Hsin-Wei Wang, Berlin Chen
Abstract: Mispronunciation detection and diagnosis (MDD) aims to detect erroneous pronunciation segments in an L2 learner's articulation and subsequently provide informative diagnostic feedback. Most existing neural methods follow a dictation-based modeling paradigm that finds pronunciation errors and returns diagnostic feedback at the same time by aligning the phone sequence recognized from the L2 learner's utterance to the corresponding canonical phone sequence of a given text prompt. The main downside of these methods, however, is that the dictation process and the alignment process are mostly made independent of each other. In view of this, we present a novel end-to-end neural method, dubbed PeppaNet, built on a unified structure that jointly models the dictation process and the alignment process. The model learns to directly predict the pronunciation correctness of each canonical phone of the text prompt and in turn provides the corresponding diagnostic feedback. In contrast to conventional dictation-based methods that rely mainly on a free-phone recognition process, PeppaNet employs an effective selective gating mechanism to simultaneously incorporate phonetic, phonological, and acoustic cues, generating corrections that are more proper and more phonetically related to the canonical pronunciations. Extensive sets of experiments conducted on the L2-ARCTIC benchmark dataset show the merits of our proposed method in comparison to some recent top-of-the-line methods.
Citations: 5
Welcome Page
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/slt54892.2023.10022398
D. Glaser
Abstract: Professor Donald A. Glaser was a master of experimental science throughout his career. Born in Cleveland and educated at Case Institute of Technology, he earned a doctorate at Caltech and taught at the University of Michigan before accepting a post at UC Berkeley in 1959. Early in his career, Dr. Glaser experimented with ways to make the workings of sub-atomic particles visible. For his subsequent invention of the bubble chamber, he was awarded the Nobel Prize in Physics 1960. He then began exploring the new field of molecular biology, improving techniques for working with bacterial phages, bacteria, and mammalian cells. By designing equipment to automate his experiments and scale them up, he could run thousands of experiments simultaneously, generating enough data to move the science forward. Recognizing the implications for medicine, Dr. Glaser and two friends created the pioneering biotech company Cetus Corporation in 1971, thus launching the genetic engineering industry.
Citations: 0
Flickering Reduction with Partial Hypothesis Reranking for Streaming ASR
2022 IEEE Spoken Language Technology Workshop (SLT), Pub Date: 2023-01-09, DOI: 10.1109/SLT54892.2023.10023016
A. Bruguier, David Qiu, Trevor Strohman, Yanzhang He
Abstract: Incremental speech recognizers start displaying results while the user is still speaking. These partial results benefit users, who like the responsiveness of the system. However, as new partial results come in, words that were previously displayed can change or disappear. The results appear unstable, and this unwanted phenomenon is called flickering. Typical remediation approaches can increase latency and reduce the quality of the partial results, but little work has been done to measure these effects. We first introduce two new metrics that allow us to measure the quality and latency of the partials. We then propose a new, lightweight approach that reranks the partial results in favor of a more stable prefix without changing the beam search. This allows us to reduce flickering without impacting the final result. We show that we can roughly halve the amount of flickering with negligible impact on the quality and latency of the partial results.
Citations: 1
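The reranking idea above can be sketched as a score adjustment over the n-best partial hypotheses. The stability bonus, its weight `alpha`, and the toy scores are assumptions for illustration, not the paper's exact formulation; as in the paper, the beam search itself is left untouched.

```python
def common_prefix_len(a, b):
    """Number of leading words two hypotheses share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def rerank_partials(hypotheses, displayed, alpha=0.1):
    """Stability-aware reranking sketch: add a bonus proportional to the
    length of the prefix a hypothesis shares with the currently displayed
    partial, then pick the top-scoring hypothesis.  Favoring stable
    prefixes reduces flickering without altering the beam search."""
    def adjusted(hyp):
        words, score = hyp
        return score + alpha * common_prefix_len(words, displayed)
    return max(hypotheses, key=adjusted)

displayed = ["play", "some", "music"]
nbest = [(["play", "sum", "music"], -1.00),    # slightly better raw score
         (["play", "some", "music"], -1.05)]   # matches what is on screen
best = rerank_partials(nbest, displayed, alpha=0.1)
assert best[0] == ["play", "some", "music"]    # stable prefix wins
```

Because only the displayed partial is reranked, the final result (emitted when the beam search completes) is unaffected, which matches the paper's claim of reducing flickering without impacting the final hypothesis.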