Proceedings : ICSLP. International Conference on Spoken Language Processing: Latest Articles

An improved vector quantization algorithm for speech transmission over noisy channels
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-100
G. Cawley
{"title":"An improved vector quantization algorithm for speech transmission over noisy channels","authors":"G. Cawley","doi":"10.21437/ICSLP.1996-100","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-100","url":null,"abstract":"","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"9 1","pages":"299-301"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83478967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Pausing strategies in discourse in Dutch
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-271
M. E. V. Donzel, F. J. K. Beinum
{"title":"Pausing strategies in discourse in dutch","authors":"M. E. V. Donzel, F. J. K. Beinum","doi":"10.21437/ICSLP.1996-271","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-271","url":null,"abstract":"The paper describes an experiment in which the different pausing strategies in discourse in Dutch were investigated. Spontaneous discourses were recorded from four male and four female native Dutch speakers. Silent and filled pauses were located in the speech signal, as well as lengthened words. These were subsequently related to different discourse structures, obtained independently from prosodic features. Results show that there are basically three different types of pausing: silent pauses, filled pauses, and lengthening of words. Speakers apply these means in different ways to achieve pausing, by using one specific pause type or a combination of more than one. The way of applying pausing is rather uniform within one speaker, whereas the choice of a particular strategy is largely speaker dependent.","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"76 1","pages":"1029-1032"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83858745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
Does lexical stress or metrical stress better predict word boundaries in Dutch?
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-407
D. V. Kuijk
{"title":"Does lexical stress or metrical stress better predict word boundaries in Dutch?","authors":"D. V. Kuijk","doi":"10.21437/ICSLP.1996-407","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-407","url":null,"abstract":"For both human and automatic speech recognizers, it is difficult to segment continuous speech into discrete units such as words. Word segmentation is so hard because there seem to be no self-evident cues for word boundaries in the speech stream. However, it has been suggested that English listeners can profit from the occurrence of full vowels (i.e. vowels with metrical stress) in the speech stream to make a first good guess about the location of word boundaries. The CELEX database study described in this paper investigates whether such a strategy is also feasible for Dutch, and whether the occurrence of full vowels or the occurrence of vowels with primary word stress (i.e. vowels with lexical stress) is a better cue for word boundaries. The CELEX counts suggest that, for Dutch, metrical stress seems to be a better predictor of word boundaries than lexical stress.","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"7 1","pages":"1585-1588"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88486598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Generating F0 contours from ToBI labels using linear regression
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-354
A. Black, A. Hunt
{"title":"Generating F0 contours from toBI labels using linear regression","authors":"A. Black, A. Hunt","doi":"10.21437/ICSLP.1996-354","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-354","url":null,"abstract":"","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"6 1","pages":"1385-1388"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83381033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 70
The multi-lag-window method for robust extended-range F0 determination
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-572
E. Geoffrois
{"title":"The multi-lag-window method for robust extended-range F0 determination","authors":"E. Geoffrois","doi":"10.21437/ICSLP.1996-572","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-572","url":null,"abstract":"This paper addresses the problem of the fundamental frequency (F0) determination of a speech signal, and proposes four improvements to conventional frequency-domain methods. The major improvement is a multi-scale analysis which extends the range of F0 that can be correctly processed. It builds on the lag-window method proposed by Sagayama (1978), hence the name “multi-lag-window”. Secondly, a modification of the lag-window method itself improves its robustness to periodic noises (while losing its gain-independence property). Thirdly, a rescaling is introduced to permit a full Dynamic Programming search for the optimal F0 curve. Finally, a mathematically justified peak interpolation is proposed for replacing the conventional, inaccurate parabolic interpolation. These four improvements result in an accurate, robust, extended-range F0 determination method, which was tested on spontaneous speech from 20 speakers, ranging from less than 50 Hz to more than 600 Hz.","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"16 1","pages":"2239-2242"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87127519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
H-infinity filtering for speech enhancement
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-226
Xuemin Shen, Li Deng, Anisa Yasmin
{"title":"H-infinity filtering for speech enhancement","authors":"Xuemin Shen, Li Deng, Anisa Yasmin","doi":"10.21437/ICSLP.1996-226","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-226","url":null,"abstract":"","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"23 1","pages":"873-876"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88552500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
An adaptive-beam pruning technique for continuous speech recognition
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-528
H. V. hamme, Filip Van Aelten
{"title":"An adaptive-beam pruning technique for continuous speech recognition","authors":"H. V. hamme, Filip Van Aelten","doi":"10.21437/ICSLP.1996-528","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-528","url":null,"abstract":"Pruning is an essential paradigm to build HMM-based large-vocabulary speech recognisers that use reasonable computing resources. Unlikely sentence, word or subword hypotheses are removed from the search space when their likelihood falls outside a beam relative to the best scoring hypothesis. A method for automatically steering this beam such that the search space attains a predefined size is presented.","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"57 1","pages":"2083-2086"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86033820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
Robust F0 and jitter estimation in pathological voices
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-188
M. Vieira, F. McInnes, M. Jack
{"title":"Robust F0 and jitter estimation in pathological voices","authors":"M. Vieira, F. McInnes, M. Jack","doi":"10.21437/ICSLP.1996-188","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-188","url":null,"abstract":"","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"40 1","pages":"745-748"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82623406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
An architecture for spoken dialogue management
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-270
D. Duff, B. Gates, S. Luperfoy
{"title":"An architecture for spoken dialogue management","authors":"D. Duff, B. Gates, S. Luperfoy","doi":"10.21437/ICSLP.1996-270","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-270","url":null,"abstract":"We propose an architecture for integrating discourse processing and speech recognition (SR) in spoken dialogue systems. It was first developed for computer-mediated bilingual dialogue in voice-to-voice machine translation applications and we apply it here to a distributed battlefield simulation system used for military training. According to this architecture, discourse functions previously distributed through the interface code are collected into a centralized discourse capability. The Dialogue Manager (DM) acts as a third-party mediator overseeing the translation of input and output utterances between English and the command language of the backend system. The DM calls the Discourse Processor (DP) to update the context representation each time an utterance is issued or when a salient non-linguistic event occurs in the simulation. The DM is responsible for managing the interaction among components of the interface system and the user. For task-based human-computer dialogue systems it consults three sources of non-linguistic context constraint in addition to the linguistic Discourse State: (1) a User Model, (2) a static Domain Model containing rules for engaging the backend system, with a grammar for the language of well-formed, executable commands, and (3) a dynamic Backend Model (BEM) that maintains updated status for salient aspects of the non-linguistic context. In this paper we describe its four-step recovery algorithm invoked by DM whenever an item is unclear in the current context, or when an interpretation error is, and show how parameter settings on the algorithm can modify the overall behavior of the system from Tutor to Trainer. 
This is offered to illustrate how limited (inexpensive) dialogue processing functionality, judiciously selected, and designed in conjunction with expectations for human dialogue behavior can compensate for inevitable limitations in SR, NL processor, the backend software application, or even in the user’s understanding of the task or the software system. 1. SPOKEN DIALOGUE SYSTEMS 1.1 Integrating Discourse and SR Waibel et al., (1989) and De Mori et al., (1988) extend stochastic language modeling techniques to the discourse level to improve spoken dialogue systems. The complexity of discourse state descriptions leads to a sparse data problem during training, and idiosyncratic human behavior at run time can defeat even the best probabilistic dialogue model. Symbolic approaches to spoken discourse data identify discourse constraints on language model selection at run time. Our work collects discourse-level processing into a centralized discourse capability as part of a modular user interface dialogue architecture. Its use in a spoken dialogue interface to a distributed battlefield simulation system used for military training is diagrammed in Figure 1.","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"69 1","pages":"1025-1028"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73827148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Using multi-level segmentation coefficients to improve HMM speech recognition
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-81
K. Hübener
{"title":"Using multi-level segmentation coefficients to improve HMM speech recognition","authors":"K. Hübener","doi":"10.21437/ICSLP.1996-81","DOIUrl":"https://doi.org/10.21437/ICSLP.1996-81","url":null,"abstract":"","PeriodicalId":90685,"journal":{"name":"Proceedings : ICSLP. International Conference on Spoken Language Processing","volume":"3 1","pages":"248-251"},"PeriodicalIF":0.0,"publicationDate":"1996-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75631272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0