5th International Conference on Spoken Language Processing (ICSLP 1998): Latest Literature

The process of generation and development of second language Japanese accentuation
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-718
Nobuko Yamada
{"title":"The process of generation and development of second language Japanese accentuation","authors":"Nobuko Yamada","doi":"10.21437/ICSLP.1998-718","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-718","url":null,"abstract":"This study investigates how non-native speakers of Japanese acquire Japanese accentuation from the viewpoint of the location of the accent nucleus. Hypothetical models for the process of generation and for the developmental sequence of interlanguage Japanese accentuation, the interim accentual system created by learners, are proposed. The subjects appear to generate their interlanguage by applying strategies or examples of accentuation, which seem to be either discovered from L2 input or chosen and fetched from memory. The subjects’ competence in accentuation appears to develop through L2 input, starting from L1 and universal properties. They seem to discover and apply five types of strategies in acquiring the target accentual rules of Japanese.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117003520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A statistical study of pitch target points in five languages
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-155
E. Campione, J. Véronis
{"title":"A statistical study of pitch target points in five languages","authors":"E. Campione, J. Véronis","doi":"10.21437/ICSLP.1998-155","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-155","url":null,"abstract":"We present the results of a large-scale statistical study of pitch target points in five languages, on a corpus comprising 4 hours 20 minutes of speech and involving 50 different speakers. The entire corpus has been stylized automatically by a technique reducing the F0 contour to a series of target points representing the significant pitch changes. It was then entirely verified by experts using a resynthesis method, in order to ensure that there was no audible difference with the original. The set of ca. 50000 pitch target points thus obtained was then analyzed from a statistical point of view. In this paper we describe the main results of this study, in terms of frequency distribution of target points, pitch movements and relation of pitch movements to time interval. Our study reveals interesting differences across languages and sex.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117008795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
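The abstract above analyses pitch movements between successive target points and their relation to the time interval. As a rough illustration only, the sketch below assumes that target points are given as (time in seconds, F0 in Hz) pairs and expresses each movement in semitones. The stylization technique itself is not reproduced here; the function name `pitch_movements`, the choice of semitones as the unit, and the sample values are all hypothetical and not taken from the paper.

```python
import math


def pitch_movements(targets):
    """Return (time interval in s, pitch movement in semitones) for each
    pair of consecutive (time_s, f0_hz) target points."""
    moves = []
    for (t0, f0), (t1, f1) in zip(targets, targets[1:]):
        # 12 semitones per octave; an octave is a doubling of F0.
        semitones = 12.0 * math.log2(f1 / f0)
        moves.append((t1 - t0, semitones))
    return moves


# Hypothetical target points, not data from the paper.
example_targets = [(0.00, 110.0), (0.25, 220.0), (0.60, 110.0)]
for interval, movement in pitch_movements(example_targets):
    print(f"{interval:.2f} s  {movement:+.1f} semitones")
```

A log-frequency unit such as the semitone makes movements comparable across speakers with different pitch ranges, which is one plausible reason to prefer it for cross-speaker statistics.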
A schema for illocutionary act identification with prosodic feature
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-138
M. Tamoto, T. Kawabata
{"title":"A schema for illocutionary act identification with prosodic feature","authors":"M. Tamoto, T. Kawabata","doi":"10.21437/ICSLP.1998-138","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-138","url":null,"abstract":"We propose a new discrimination schema for illocutionary acts using prosodic features based on experimental results. We performed a series of experiments in which subjects were asked to identify the sentence type and intonation contour of given stimuli. Given the transcribed sentence with contextual information, the subjects were able to identify correctly the sentence type of 85% of 290 sentences. With information about the intonation contour types, they could correctly identify 90% of speech acts. We find evidence that illocutionary acts can be signaled by specific contour types. These typical contours are realized in the sentence final boundary tone; a neutral or falling tone for assertion and request, a rising tone for question. An intonation contour is then identified using an algorithm that calculates the range and slope of the upper and lower bounds of unwarped segmental contour, and matches these against predefined contour templates. This algorithm could correctly recognize 78% of the pitch contour types in the utterances. Furthermore, with this automated intonation contour classification, nearly 90% of speech acts could be correctly identified.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117045980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Real time speaker indexing based on subspace method - application to TV news articles and debate
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-243
M. Nishida, Y. Ariki
{"title":"Real time speaker indexing based on subspace method - application to TV news articles and debate","authors":"M. Nishida, Y. Ariki","doi":"10.21437/ICSLP.1998-243","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-243","url":null,"abstract":"In this paper, we propose a method to extract and verify individual speaker utterance using a subspace method. This method can extract speech section of the same speaker by repeating speaker verification between the present speech section and the immediately previous speech section. The speaker models are automatically trained in the verification process without constructing speaker templates in advance. As a result, this speaker verification method is applied to speaker indexing. In this study, announcer utterances are automatically extracted from news speech data which includes reporter or interviewer utterances. Also extracted automatically are the utterances of each participator in debate program broadcasted on TV.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117156541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Automatic detection of landmark for nasal consonants from speech waveform
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-516
Limin Du, K. Stevens
{"title":"Automatic detection of landmark for nasal consonants from speech waveform","authors":"Limin Du, K. Stevens","doi":"10.21437/ICSLP.1998-516","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-516","url":null,"abstract":"A knowledge-based approach towards automatically detecting nasal landmarks (/m/, /n/, and /ng/) from speech waveform is developed. The acoustic characteristics Fn1 locus calculated on each frame of speech waveform as the mass center of spectrum amplitude in the vicinity of the lowest spectral prominence between 150-1000Hz, and A23 locus calculated on the same speech frame as a band energy between 1000-3000Hz were incorporated together to construct the nasal landmark detector, which alarms at the instants of closure and release of nasal murmur. Experiment observations on the acoustic characteristics of Fn1 and A23 and the nasal consonant landmark detection results on the VCV database are also presented.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115987871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
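The abstract above defines two per-frame acoustic measures: Fn1, a spectral centre of mass near the lowest spectral prominence between 150 and 1000 Hz, and A23, a band energy between 1000 and 3000 Hz. The sketch below is one possible reading of those definitions, not the authors' detector. Several details are assumptions: the prominence is approximated by the strongest spectral bin in the 150-1000 Hz band, the "vicinity" is a hypothetical ±100 Hz window (`half_width_hz`), the energy is a sum of squared magnitudes, and a naive DFT stands in for whatever spectral analysis the paper used.

```python
import cmath
import math


def magnitude_spectrum(frame, sample_rate):
    """Naive DFT: list of (frequency_hz, magnitude) pairs up to Nyquist."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append((k * sample_rate / n, abs(acc)))
    return spectrum


def fn1_and_a23(frame, sample_rate, half_width_hz=100.0):
    """Return (Fn1 in Hz, A23 band energy) for one speech frame."""
    spec = magnitude_spectrum(frame, sample_rate)
    low_band = [(f, m) for f, m in spec if 150.0 <= f <= 1000.0]
    # Assumption: take the strongest bin in 150-1000 Hz as the prominence.
    peak_freq, _ = max(low_band, key=lambda fm: fm[1])
    nearby = [(f, m) for f, m in low_band if abs(f - peak_freq) <= half_width_hz]
    total = sum(m for _, m in nearby)
    # Amplitude-weighted mean frequency around the prominence.
    fn1 = sum(f * m for f, m in nearby) / total if total > 0 else peak_freq
    # Assumption: band energy as the sum of squared magnitudes.
    a23 = sum(m * m for f, m in spec if 1000.0 <= f <= 3000.0)
    return fn1, a23


# Hypothetical test frame: 20 ms of a 300 Hz sine sampled at 8 kHz.
rate = 8000
frame = [math.sin(2 * math.pi * 300 * i / rate) for i in range(160)]
fn1, a23 = fn1_and_a23(frame, rate)
print(f"Fn1 = {fn1:.1f} Hz, A23 = {a23:.3g}")
```

A landmark detector in the spirit of the abstract would track both values frame by frame and flag abrupt changes at nasal closure and release; that decision logic is not specified in the abstract and is left out here.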
System-user interaction and response strategy in spoken dialogue system
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-73
Y. Okato, Keiji Kato, Mikio Yamamoto, S. Itahashi
{"title":"System-user interaction and response strategy in spoken dialogue system","authors":"Y. Okato, Keiji Kato, Mikio Yamamoto, S. Itahashi","doi":"10.21437/ICSLP.1998-73","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-73","url":null,"abstract":"There are a number of restrictions in human-machine interactions, which continue to warrant a better control of response utterances in spoken dialogue systems. Indeed, the human user often has to deal with unnatural responses, and therefore requires some experience with such systems in order to improve interactions. This problem is re-examined here, with the aim of evaluating how human users are influenced by utterances built into a system based on the Wizard-of-Oz method. We report results which show that back-channel responses and brief confirmations from our system, have the effects of prompting human spoken interactions and providing more human satisfaction.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116353496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Combining connectionist multi-band and full-band probability streams for speech recognition of natural numbers
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-404
Nikki Mirghafori, N. Morgan
{"title":"Combining connectionist multi-band and full-band probability streams for speech recognition of natural numbers","authors":"Nikki Mirghafori, N. Morgan","doi":"10.21437/ICSLP.1998-404","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-404","url":null,"abstract":"Multi-band automatic speech recognition is a new and exploratory area of speech recognition which has been getting much attention in the research community. It has been shown that multi-band ASR reduces word error in noisy conditions, particularly in the case of narrow band noise. In this work we show that multi-band ASR could be used to improve the speech recognition accuracy of natural numbers for clean speech when the multi-band (MB) information stream is used in addition to the full-band (FB) one. We also observe that a similar combination method significantly reduces the error rate on reverberant speech. Finally, we analyze the error patterns of the full-band and multi-band paradigms to understand why the combination of the two streams is effective.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123475973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 52
On the significance of temporal masking in speech coding
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-381
J. Skoglund, W. Kleijn
{"title":"On the significance of temporal masking in speech coding","authors":"J. Skoglund, W. Kleijn","doi":"10.21437/ICSLP.1998-381","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-381","url":null,"abstract":"This paper addresses the issue of masking of noise in voiced speech. First, we examine the audibility of cyclostationary narrowband noise added to voiced speech generated by synthetic excitation. Varying the temporal location of noise within a pitch cycle corresponds to varying its phase spectrum. Using this fact, we find that a phase change of the noise in the high frequency region is more perceptible for a low-pitched sound than for a high-pitched sound. We propose a pitch-dependent temporal weighting function and we show experimentally that it is beneficial to the quantization of pitch-cycle waveforms.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"33 1-2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123606445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Robust speech recognition using discriminative stream weighting and parameter interpolation
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-319
Stephen M. Chu, Yunxin Zhao
{"title":"Robust speech recognition using discriminative stream weighting and parameter interpolation","authors":"Stephen M. Chu, Yunxin Zhao","doi":"10.21437/ICSLP.1998-319","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-319","url":null,"abstract":"This paper presents a method to improve the robustness of speech recognition in noisy conditions. It has been shown that using dynamic features in addition to static features can improve the noise robustness of speech recognizers. In this work we show that in a continuous-density Hidden Markov Model (HMM) based speech recognition system, weighting the contribution of the dynamic features according to SNR levels can further improve the performance, and we propose a two-step scheme to adapt the weights for a given Signal to Noise Ratio (SNR). The first step is to obtain the optimal weights for a set of selected SNR levels by discriminative training. The Generalized Probabilistic Descent (GPD) framework is used in our experiments. The second step is to interpolate the set of SNR-specific weights obtained in step one for a new SNR condition. Experimental results obtained by the proposed technique are encouraging.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123640325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
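The second step described in the abstract above, interpolating SNR-specific weights for a new SNR condition, might look roughly like the sketch below. This is an assumption-laden illustration: the abstract does not say which interpolation is used, so piecewise-linear interpolation with clamping outside the trained range is chosen here for simplicity, and the function name `interpolate_weight` and the weight values are hypothetical (they are not results from the paper). The discriminative training of step one is not shown.

```python
from bisect import bisect_left


def interpolate_weight(snr_db, trained):
    """Piecewise-linear interpolation of a dynamic-feature stream weight.

    trained: list of (snr_db, weight) pairs sorted by increasing SNR,
    e.g. the output of a discriminative training step at selected SNRs.
    Values outside the trained range are clamped to the nearest endpoint.
    """
    snrs = [s for s, _ in trained]
    if snr_db <= snrs[0]:
        return trained[0][1]
    if snr_db >= snrs[-1]:
        return trained[-1][1]
    hi = bisect_left(snrs, snr_db)  # first trained SNR >= snr_db
    (s0, w0), (s1, w1) = trained[hi - 1], trained[hi]
    frac = (snr_db - s0) / (s1 - s0)
    return w0 + frac * (w1 - w0)


# Hypothetical SNR-specific weights, not values from the paper.
trained_weights = [(0.0, 1.8), (10.0, 1.4), (20.0, 1.0)]
print(interpolate_weight(15.0, trained_weights))
```

Clamping rather than extrapolating keeps the weight within the range that was actually trained, which seems the safer default when an estimated SNR falls outside the selected levels.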
Wavelet transform-based speech enhancement
5th International Conference on Spoken Language Processing (ICSLP 1998) Pub Date : 1998-11-30 DOI: 10.21437/ICSLP.1998-348
E. Ambikairajah, G. Tattersall, A. Davis
{"title":"Wavelet transform-based speech enhancement","authors":"E. Ambikairajah, G. Tattersall, A. Davis","doi":"10.21437/ICSLP.1998-348","DOIUrl":"https://doi.org/10.21437/ICSLP.1998-348","url":null,"abstract":"This paper describes a speech enhancement system using a novel combination of a Fast Wavelet Transform structure, together with “Wiener filtering” in the wavelet domain. The specific application of interest is the enhancement of speech when a cellular phone is used within a moving vehicle. Subjective tests carried out using speech with additive vehicle noise at a signal-to-noise ratio of 10 dB indicate that the Wavelet transform-based Wiener filtering approach works well. In particular, the technique was compared to several other common enhancement methods such as thresholding applied in the wavelet domain, FFT-based Wiener filtering, and spectral subtraction, and was found to outperform these other techniques.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121971433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 31