Proceedings : ICSLP. International Conference on Spoken Language Processing: Latest Papers

Pausing strategies in discourse in Dutch
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-271
M. E. V. Donzel, F. J. K. Beinum
Abstract: The paper describes an experiment in which the different pausing strategies in discourse in Dutch were investigated. Spontaneous discourses were recorded from four male and four female native Dutch speakers. Silent and filled pauses were located in the speech signal, as well as lengthened words. These were subsequently related to different discourse structures, obtained independently from prosodic features. Results show that there are basically three different types of pausing: silent pauses, filled pauses, and lengthening of words. Speakers apply these means in different ways to achieve pausing, by using one specific pause type or a combination of more than one. The way of applying pausing is rather uniform within one speaker, whereas the choice of a particular strategy is largely speaker dependent.
Pages: 1029-1032
Citations: 29
Relationship between discourse structure and dynamic speech rate
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-438
F. J. K. Beinum, M. E. V. Donzel
Abstract: This paper concerns one specific element of a larger research project on the acoustic determinants of information structure in spontaneous and read discourse in Dutch. A previous experiment within that project showed that listeners used two main cues (viz. speaking rate and intonation) to differentiate between spontaneous and read speech. The aim of the present experiment is to investigate the role of one of these prosodic cues, i.e., the local variability in speaking rate, and to study the relationship between the information structure of a spoken discourse on the one hand, and dynamic speaking rate measurements of that discourse on the other hand. Results show that there is a large variability in average syllable duration (ASD) over the various interpausal speech runs for each of the eight speakers. No straightforward relation is found between the number of syllables within a run and the average syllable duration. We hypothesize that, at least in spontaneous speech, variations in speaking rate are related to the (global and/or local) information structures in the discourse. Global analysis of the discourse structure in paragraphs and clauses reveals that for each of the speakers the average syllable duration of the first run of a paragraph is longer than the overall mean value per speaker in more than 60% of the cases. Inspection of the quartiles of runs with highest ASD-values and those with lowest ASD-values for each of the speakers shows quite different structures, which can be explained on the basis of partly local and partly global discourse characteristics.
Pages: 1724-1727
Citations: 24
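The run-level measurement described in the abstract above can be illustrated with a minimal sketch. It assumes each interpausal run is already annotated with its duration, its syllable count, and whether it opens a paragraph; the `Run` type, the field names, and the choice to take the speaker mean as the unweighted mean over runs are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One interpausal speech run (hypothetical annotation format)."""
    duration_s: float        # duration of the run in seconds
    n_syllables: int         # number of syllables in the run
    paragraph_initial: bool  # True if the run is the first of a paragraph


def asd(run: Run) -> float:
    """Average syllable duration of a run, in seconds per syllable."""
    return run.duration_s / run.n_syllables


def initial_runs_above_mean(runs: List[Run]) -> float:
    """Fraction of paragraph-initial runs whose ASD exceeds the speaker mean.

    Assumption: the speaker mean is the unweighted mean of per-run ASDs.
    """
    values = [asd(r) for r in runs]
    mean_asd = sum(values) / len(values)
    initial = [asd(r) for r in runs if r.paragraph_initial]
    if not initial:
        raise ValueError("no paragraph-initial runs in the input")
    return sum(v > mean_asd for v in initial) / len(initial)
```

Applied to one speaker's runs, a result above 0.6 would correspond to the "more than 60% of the cases" pattern reported in the abstract.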
An improved vector quantization algorithm for speech transmission over noisy channels
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-100
G. Cawley
Pages: 299-301
Citations: 1
Generating F0 contours from ToBI labels using linear regression
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-354
A. Black, A. Hunt
Pages: 1385-1388
Citations: 70
The multi-lag-window method for robust extended-range F0 determination
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-572
E. Geoffrois
Abstract: This paper addresses the problem of the fundamental frequency (F0) determination of a speech signal, and proposes four improvements to conventional frequency-domain methods. The major improvement is a multi-scale analysis which extends the range of F0 that can be correctly processed. It builds on the lag-window method proposed by Sagayama (1978), hence the name "multi-lag-window". Secondly, a modification of the lag-window method itself improves its robustness to periodic noises (while losing its gain-independence property). Thirdly, a rescaling is introduced to permit a full Dynamic Programming search for the optimal F0 curve. Finally, a mathematically justified peak interpolation is proposed for replacing the conventional, inaccurate parabolic interpolation. These four improvements result in an accurate, robust, extended-range F0 determination method, which was tested on spontaneous speech from 20 speakers, ranging from less than 50 Hz to more than 600 Hz.
Pages: 2239-2242
Citations: 8
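For context, the conventional parabolic interpolation that the abstract above calls inaccurate can be sketched as follows. This is the standard three-point baseline, not the interpolation proposed in the paper, which the abstract does not detail. The function names are hypothetical, and applying the refinement to a peak at some lag of a sampled function, then converting that lag to F0, is an assumption made for illustration.

```python
from typing import Sequence


def parabolic_peak(values: Sequence[float], k: int) -> float:
    """Refine a local maximum at integer index k (1 <= k <= len(values) - 2).

    Fits a parabola through values[k-1], values[k], values[k+1] and
    returns the fractional index of its vertex.
    """
    y_prev, y_peak, y_next = values[k - 1], values[k], values[k + 1]
    denom = y_prev - 2.0 * y_peak + y_next
    if denom == 0.0:
        # The three points are collinear, so there is no vertex to find.
        return float(k)
    return k + 0.5 * (y_prev - y_next) / denom


def lag_to_f0(lag_samples: float, sample_rate_hz: float) -> float:
    """Convert a period expressed in samples to a frequency in Hz."""
    return sample_rate_hz / lag_samples
```

The fit is exact only when the sampled function really is a parabola near its peak; for other peak shapes the estimate is biased, which is the kind of inaccuracy the abstract refers to.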
H-infinity filtering for speech enhancement
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-226
Xuemin Shen, Li Deng, Anisa Yasmin
Pages: 873-876
Citations: 3
An adaptive-beam pruning technique for continuous speech recognition
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-528
H. V. hamme, Filip Van Aelten
Abstract: Pruning is an essential paradigm to build HMM-based large vocabulary speech recognisers that use reasonable computing resources. Unlikely sentence, word or subword hypotheses are removed from the search space when their likelihood falls outside a beam relative to the best scoring hypothesis. A method for automatically steering this beam such that the search space attains a predefined size is presented.
Pages: 2083-2086
Citations: 24
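The idea of beam pruning with a steered beam width, as summarised in the abstract above, can be sketched as follows. The pruning step follows the abstract's description (drop hypotheses whose score falls outside a beam relative to the best one). The steering rule is only an assumed proportional feedback rule chosen for illustration; the abstract does not describe the authors' actual steering method, and all names and default parameters here are hypothetical.

```python
import math
from typing import Dict


def prune(scores: Dict[str, float], beam: float) -> Dict[str, float]:
    """Keep hypotheses whose log-likelihood is within `beam` of the best."""
    best = max(scores.values())
    return {hyp: s for hyp, s in scores.items() if s >= best - beam}


def steer_beam(beam: float, n_active: int, target: int,
               gain: float = 0.5,
               min_beam: float = 1.0, max_beam: float = 50.0) -> float:
    """Assumed feedback rule, not the paper's method.

    Narrows the beam when more than `target` hypotheses survive and
    widens it when fewer survive, then clamps to [min_beam, max_beam].
    """
    # The log-ratio error is zero when the search space has the target size.
    error = math.log(target / max(n_active, 1))
    return min(max_beam, max(min_beam, beam * math.exp(gain * error)))
```

In a decoder loop, `prune` would run at every frame and `steer_beam` would update the beam from the number of surviving hypotheses before the next frame.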
Robust F0 and jitter estimation in pathological voices
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-188
M. Vieira, F. McInnes, M. Jack
Pages: 745-748
Citations: 15
An architecture for spoken dialogue management
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-270
D. Duff, B. Gates, S. Luperfoy
Abstract: We propose an architecture for integrating discourse processing and speech recognition (SR) in spoken dialogue systems. It was first developed for computer-mediated bilingual dialogue in voice-to-voice machine translation applications, and we apply it here to a distributed battlefield simulation system used for military training. According to this architecture, discourse functions previously distributed through the interface code are collected into a centralized discourse capability. The Dialogue Manager (DM) acts as a third-party mediator overseeing the translation of input and output utterances between English and the command language of the backend system. The DM calls the Discourse Processor (DP) to update the context representation each time an utterance is issued or when a salient non-linguistic event occurs in the simulation. The DM is responsible for managing the interaction among components of the interface system and the user. For task-based human-computer dialogue systems it consults three sources of non-linguistic context constraint in addition to the linguistic Discourse State: (1) a User Model, (2) a static Domain Model containing rules for engaging the backend system, with a grammar for the language of well-formed, executable commands, and (3) a dynamic Backend Model (BEM) that maintains updated status for salient aspects of the non-linguistic context. In this paper we describe its four-step recovery algorithm, invoked by the DM whenever an item is unclear in the current context or when an interpretation error is detected, and show how parameter settings on the algorithm can modify the overall behavior of the system from Tutor to Trainer. This is offered to illustrate how limited (inexpensive) dialogue processing functionality, judiciously selected and designed in conjunction with expectations for human dialogue behavior, can compensate for inevitable limitations in SR, the NL processor, the backend software application, or even in the user's understanding of the task or the software system.
1. SPOKEN DIALOGUE SYSTEMS
1.1 Integrating Discourse and SR
Waibel et al. (1989) and De Mori et al. (1988) extend stochastic language modeling techniques to the discourse level to improve spoken dialogue systems. The complexity of discourse state descriptions leads to a sparse data problem during training, and idiosyncratic human behavior at run time can defeat even the best probabilistic dialogue model. Symbolic approaches to spoken discourse data identify discourse constraints on language model selection at run time. Our work collects discourse-level processing into a centralized discourse capability as part of a modular user interface dialogue architecture. Its use in a spoken dialogue interface to a distributed battlefield simulation system used for military training is diagrammed in Figure 1.
Pages: 1025-1028
Citations: 15
Using multi-level segmentation coefficients to improve HMM speech recognition
Proceedings : ICSLP. International Conference on Spoken Language Processing Pub Date : 1996-10-03 DOI: 10.21437/ICSLP.1996-81
K. Hübener
Pages: 248-251
Citations: 0