Proceedings : ICSLP. International Conference on Spoken Language Processing (Latest Publications)

Audiovisual integration of speech by children and adults with cochlear implants
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 2002-09-16. DOI: 10.21437/ICSLP.2002-427
K. Kirk, D. Pisoni, Lorin Lachs
Abstract: The present study examined how prelingually deafened children and postlingually deafened adults with cochlear implants (CIs) combine visual speech information with auditory cues. Performance was assessed under auditory-alone (A), visual-alone (V), and combined audiovisual (AV) presentation formats. A measure of visual enhancement, R_A, was used to assess the gain in performance provided in the AV condition relative to the maximum possible performance in the auditory-alone format. Word recognition was highest for AV presentation, followed by A and V, respectively. Children who received more visual enhancement also produced more intelligible speech. Adults with CIs made better use of visual information in more difficult listening conditions (e.g., when multiple talkers or phonemically similar words were used). The findings are discussed in terms of the complementary nature of auditory and visual sources of information that specify the same underlying gestures and articulatory events in speech.
Pages: 1689-1692
Citations: 9
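The visual-enhancement measure R_A mentioned in the abstract expresses the gain from adding vision relative to the headroom left by the auditory-alone score. The listing does not spell out the authors' exact formula, so the sketch below uses the conventional definition from the audiovisual speech literature, R = (AV - A) / (100 - A) for percent-correct scores, as an assumption.

```python
def visual_enhancement(av_score, a_score):
    """Relative visual enhancement: gain from adding vision (AV) over audition
    alone (A), normalised by the maximum possible gain.  Scores are percent-correct
    word recognition (0-100).  NOTE: this is the conventional formula from the
    audiovisual speech literature; the listing does not give the authors' exact
    definition of R_A, so treat it as an assumption."""
    if a_score >= 100.0:
        return 0.0  # no headroom left for improvement
    return (av_score - a_score) / (100.0 - a_score)

# Example: A = 40% correct, AV = 70% correct -> enhancement of 0.5
print(visual_enhancement(70.0, 40.0))
```
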
Audiovisual integration of speech by children and adults with cochlear implants (open-access PubMed Central record of the entry above)
Proceedings : ICSLP. International Conference on Spoken Language Processing, 2002. Pages: 1689-1692
Karen Iler Kirk, David B. Pisoni, Lorin Lachs
Abstract identical to the record above. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4214155/pdf/nihms410773.pdf
Citations: 0
SABLE: a standard for TTS markup
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1998-11-30. DOI: 10.21437/ICSLP.1998-14
R. Sproat, A. Hunt, Mari Ostendorf, P. Taylor, A. Black, K. Lenzo, M. Edgington
Abstract: Currently, speech synthesizers are controlled by a multitude of proprietary tag sets. These tag sets vary substantially across synthesizers and are an inhibitor to the adoption of speech synthesis technology by developers. SABLE is an XML/SGML-based markup scheme for text-to-speech synthesis, developed to address the need for a common TTS control paradigm. This paper presents an overview of the SABLE specification, and provides links to sites where further information on SABLE can be accessed.
Pages: 27-30
Citations: 56
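As a rough illustration of what an XML-based TTS markup like SABLE looks like, the Python sketch below assembles a small marked-up utterance with the standard library. The element names (EMPH, BREAK, RATE) are recalled from the published SABLE drafts and should be treated as approximations; the specification linked from the paper is authoritative.

```python
import xml.etree.ElementTree as ET

# Illustrative only: element and attribute names approximate the SABLE tag set
# (EMPH, BREAK, RATE); consult the specification referenced in the paper for
# the authoritative markup.
doc = ET.Element("SABLE")
doc.text = "The meeting starts at "
emph = ET.SubElement(doc, "EMPH")          # emphasize a word
emph.text = "ten"
emph.tail = " o'clock."
ET.SubElement(doc, "BREAK", LEVEL="large") # insert a pause
rate = ET.SubElement(doc, "RATE", SPEED="-20%")
rate.text = "Please arrive a few minutes early."

print(ET.tostring(doc, encoding="unicode"))
```
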
Efficient adaptation of TTS duration model to new speakers
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1998-11-30. DOI: 10.21437/ICSLP.1998-5
Chilin Shih, Wentao Gu, J. V. Santen
Abstract: This paper discusses a methodology using a minimal set of sentences to adapt an existing TTS duration model to capture interspeaker variations. The assumption is that the original duration database contains information on both language-specific and speaker-specific duration characteristics. In training a duration model for a new speaker, only the speaker-specific information needs to be modeled; therefore the size of the training data can be reduced drastically. Results from several experiments are compared and discussed.
Pages: 105-110
Citations: 11
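The abstract argues that only speaker-specific behaviour needs to be re-estimated for a new voice, so very little adaptation data is required. The paper's actual adaptation procedure is not given in this listing; the sketch below illustrates the general idea with an assumed, deliberately simple speaker correction: a global linear transform of log durations fitted by least squares to a few adaptation sentences.

```python
import numpy as np

# Minimal sketch (not the paper's method): treat the existing TTS duration
# model's predictions as language-specific, and learn a small speaker-specific
# correction -- here a global linear transform in the log-duration domain --
# from a handful of adaptation sentences.  All numbers are made up.
base_pred_ms = np.array([55.0, 120.0, 80.0, 95.0, 60.0, 140.0])    # base model
new_speaker_ms = np.array([60.0, 150.0, 90.0, 110.0, 66.0, 170.0])  # observed

X = np.column_stack([np.log(base_pred_ms), np.ones_like(base_pred_ms)])
a, b = np.linalg.lstsq(X, np.log(new_speaker_ms), rcond=None)[0]

def adapt(duration_ms):
    """Map a base-model segment duration to the new speaker's expected duration."""
    return float(np.exp(a * np.log(duration_ms) + b))

print(adapt(100.0))  # adapted duration for a 100 ms base-model prediction
```
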
A three-dimensional linear articulatory model based on MRI data
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1998-11-30. DOI: 10.21437/ICSLP.1998-353
P. Badin, G. Bailly, M. Raybaudi, C. Segebarth
Abstract: Based on a set of 3D vocal tract images obtained by MRI, a 3D statistical articulatory model has been built using guided Principal Component Analysis. It constitutes an extension to the lateral dimension of the mid-sagittal model previously developed from a radiofilm recorded on the same subject. The parameters of the 2D model have been found to be good predictors of the 3D shapes for most configurations. A first evaluation of the model in terms of area functions and formants is presented.
Pages: 249-254
Citations: 69
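The model is built by applying (guided) Principal Component Analysis to vocal-tract shapes extracted from MRI, yielding a handful of linear control parameters. The sketch below shows only the plain PCA step on stacked 3D point coordinates with synthetic stand-in data; the guided aspect (constraining components to known articulators such as the jaw) and all names and sizes are assumptions, not the authors' implementation.

```python
import numpy as np

# Sketch of the statistical-model idea: each MRI-derived vocal-tract shape is
# flattened into a vector of 3D point coordinates; PCA over the corpus gives a
# small set of linear control parameters.  Random data stands in for the MRI
# contours, and the paper's "guided" constraints are not reproduced here.
rng = np.random.default_rng(0)
n_shapes, n_points = 40, 200
shapes = rng.normal(size=(n_shapes, n_points * 3))

mean_shape = shapes.mean(axis=0)
U, S, Vt = np.linalg.svd(shapes - mean_shape, full_matrices=False)
components = Vt[:6]                      # first 6 linear articulatory parameters

def synthesize(params):
    """Reconstruct a vocal-tract shape from the 6 control parameters."""
    return mean_shape + params @ components

print(synthesize(np.zeros(6)).shape)     # (600,) -> 200 points x 3 coordinates
```
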
Global optimisation of neural network models via sequential sampling-importance resampling
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1998-09-16. DOI: 10.21437/ICSLP.1998-412
J. F. G. D. Freitas, S. E. Johnson, M. Niranjan, A. Gee
Abstract: We propose a novel strategy for training neural networks using sequential Monte Carlo algorithms. This global optimisation strategy allows us to learn the probability distribution of the network weights in a sequential framework. It is well suited to applications involving on-line, nonlinear or non-stationary signal processing. We show how the new algorithms can outperform extended Kalman filter (EKF) training.
Pages: 410-416
Citations: 7
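Sequential sampling-importance resampling treats the network weights as a state tracked by a cloud of particles: propagate the particles, weight them by the likelihood of each new observation, and resample. The sketch below is a minimal particle filter over the two weights of a toy one-neuron model; it illustrates the idea only and is not the authors' algorithm or experimental setup.

```python
import numpy as np

# Minimal particle-filter sketch of sequential sampling-importance resampling
# over the weights of a toy model y = w1 * tanh(w0 * x).  The true weights are
# (0.8, 1.5); observation noise and drift values are arbitrary.
rng = np.random.default_rng(1)
n_particles = 500
particles = rng.normal(size=(n_particles, 2))      # one (w0, w1) per particle
obs_noise, drift = 0.1, 0.02

def predict(w, x):
    """Network output for every particle at once."""
    return w[:, 1] * np.tanh(w[:, 0] * x)

for t in range(200):                               # stream of (x, y) observations
    x = rng.uniform(-2, 2)
    y = 1.5 * np.tanh(0.8 * x) + rng.normal(scale=obs_noise)

    particles = particles + rng.normal(scale=drift, size=particles.shape)  # sample
    log_lik = -0.5 * ((y - predict(particles, x)) / obs_noise) ** 2        # importance
    weights = np.exp(log_lik - log_lik.max())
    weights /= weights.sum()
    idx = rng.choice(n_particles, size=n_particles, p=weights)             # resample
    particles = particles[idx]

print(particles.mean(axis=0))   # posterior mean of the two weights
```
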
Word boundary detection using pitch variations
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1996-10-03. DOI: 10.21437/ICSLP.1996-211
V. R. Gadde, J. Srichand
Abstract: This paper proposes a method for detecting word boundaries based on the behaviour of the pitch frequency across a sentence. The pitch frequency (F0) is found to rise within a word and fall toward the next word, and the presence of this fall is proposed as a cue for detecting word boundaries. On four major Indian languages, nearly 85% of the word boundaries were correctly detected; the same method applied to German detected nearly 65% of the word boundaries correctly. The implications of these results for the development of a continuous speech recognition system are discussed.
Pages: 813-816
Citations: 24
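Under the hypothesis in the abstract, F0 rises within a word and falls toward the following word, so a rise-to-fall turning point in the pitch contour marks a candidate boundary. A minimal sketch of that cue is shown below; the F0 values and the threshold are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Sketch of the boundary cue described in the abstract: flag frames where a
# rising F0 stretch turns into a falling one.  The contour and the 2 Hz
# threshold are arbitrary illustrative values, not from the paper.
f0 = np.array([110, 118, 126, 131, 122, 112, 115, 124, 133, 125, 114], float)  # Hz per frame

delta = np.diff(f0)
candidates = [i for i in range(1, len(delta))
              if delta[i - 1] > 2.0 and delta[i] < -2.0]   # rise followed by fall

print(candidates)   # frame indices flagged as possible word boundaries
```
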
Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1996-10-03. DOI: 10.21437/ICSLP.1996-454
S. Vuuren
Abstract: We compare speaker recognition performance of vector quantization (VQ), Gaussian mixture modeling (GMM) and the Arithmetic Harmonic Sphericity measure (AHS) in adverse telephone speech conditions. The aim is to address the question: how do multimodal VQ and GMM typically compare to the simpler unimodal AHS for matched and mismatched training and testing environments? We study identification (closed set) and verification errors on a new multi-environment database. We consider LPC and PLP features as well as their RASTA derivatives. We conclude that RASTA processing can remove redundancies from the features. We affirm that even when we use channel and noise compensation schemes, speaker recognition errors remain high when there is acoustic mismatch.
Pages: 1788-1791
Citations: 46
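Of the three methods compared, GMM speaker modelling is the easiest to sketch: train one mixture model per speaker and identify a test utterance by the best average log-likelihood. The example below uses random vectors in place of the paper's LPC/PLP(+RASTA) telephone-speech features, so it shows only the mechanics of closed-set identification, not the reported comparison.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Closed-set GMM speaker identification sketch: one GMM per speaker, pick the
# model with the highest average log-likelihood on the test utterance.  Random
# 12-dimensional vectors stand in for LPC/PLP(+RASTA) telephone-speech features.
rng = np.random.default_rng(2)
train = {spk: rng.normal(loc=mu, size=(400, 12))
         for spk, mu in [("spk_a", 0.0), ("spk_b", 1.0), ("spk_c", -1.0)]}

models = {spk: GaussianMixture(n_components=4, random_state=0).fit(feats)
          for spk, feats in train.items()}

test = rng.normal(loc=1.0, size=(150, 12))                   # utterance from spk_b
scores = {spk: m.score(test) for spk, m in models.items()}   # avg log-likelihood
print(max(scores, key=scores.get))                           # identified speaker
```
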
Distinction between 'normal' focus and 'contrastive/emphatic' focus
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1996-10-03. DOI: 10.21437/ICSLP.1996-162
A. Elsner
Pages: 642-645
Citations: 0
Does lexical stress or metrical stress better predict word boundaries in Dutch?
Proceedings : ICSLP. International Conference on Spoken Language Processing. Pub Date: 1996-10-03. DOI: 10.21437/ICSLP.1996-407
D. V. Kuijk
Abstract: For both human and automatic speech recognizers, it is difficult to segment continuous speech into discrete units such as words. Word segmentation is so hard because there seem to be no self-evident cues for word boundaries in the speech stream. However, it has been suggested that English listeners can profit from the occurrence of full vowels (i.e. vowels with metrical stress) in the speech stream to make a first good guess about the location of word boundaries. The CELEX database study described in this paper investigates whether such a strategy is also feasible for Dutch, and whether the occurrence of full vowels or the occurrence of vowels with primary word stress (i.e. vowels with lexical stress) is a better cue for word boundaries. The CELEX counts suggest that, for Dutch, metrical stress seems to be a better predictor of word boundaries than lexical stress.
Pages: 1585-1588
Citations: 6
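The study is a lexicon count: for each cue (full vowel vs. primary lexical stress), how reliably does a flagged syllable coincide with a word onset? The toy Python sketch below runs that count over a three-word, hand-annotated list whose syllables and stress flags are invented for illustration; the paper performs the equivalent count over the Dutch CELEX lexicon, and the toy numbers say nothing about its outcome.

```python
# Toy version of a CELEX-style count: for each cue, what fraction of flagged
# syllables are word-initial?  The word list and flags below are invented for
# illustration only; the paper uses the Dutch CELEX lexicon.
words = [  # (syllables, full-vowel flags, primary-stress flags)
    (["ka", "mer"],        [1, 0],    [1, 0]),
    (["ba", "naan"],       [0, 1],    [0, 1]),
    (["wan", "de", "len"], [1, 1, 0], [1, 0, 0]),
]

def boundary_hit_rate(flag_index):
    """Fraction of flagged syllables that start a word (0 = full, 1 = primary)."""
    hits = total = 0
    for syls, full, prim in words:
        flags = (full, prim)[flag_index]
        for pos, f in enumerate(flags):
            if f:
                total += 1
                hits += (pos == 0)
    return hits / total

print("full vowels:", boundary_hit_rate(0))
print("primary stress:", boundary_hit_rate(1))
```
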