2008 IEEE Spoken Language Technology Workshop: Latest Papers

Modelling multimodal user ID in dialogue
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777853
H. Holzapfel, A. Waibel
Abstract: This paper presents an approach to model user ID in dialogue. A belief network is used to integrate ID classifiers, such as face ID and voice ID, and person-related information, such as the first name and last name of a person from speech recognition or spelling. Different network structures are analyzed and compared with each other and with a rule-based user model. The approach is evaluated on dialogue data collected in a person identification scenario, which includes both identification of known persons and interactive learning of names and ID of unknown persons.
Citations: 4
Identifying salient utterances of online spoken documents using descriptive hypertext
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777868
Xiao-Dan Zhu, Siavash Kazemian, Gerald Penn
Abstract: The Internet has become an important supply channel of spoken documents. Efficient ways of navigating their content are highly desirable. This paper aims to identify the most salient utterances from online spoken documents using relevant hypertext that encapsulates key information. Experimental results show that hypertext features are helpful when properly utilized and if the bit rates used to compress the spoken documents are reasonable.
Citations: 0
Joint n-best rescoring for repeated utterances in spoken dialog systems
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777858
D. Bohus, G. Zweig, Patrick Nguyen, Xiao Li
Abstract: Due to speech recognition errors, repetitions are a frequent phenomenon in spoken dialog systems. In previous work (G. Zweig et al., 2008) we have proposed a joint decoding model that can leverage structural relationships between repeated utterances for improving recognition performance. In this paper we extend this work in two directions. First, we propose a direct, classification-based model for the same task. The new model can leverage features that were fundamentally hard to capture in the previous framework (e.g. spellings, false starts) and leads to an additional performance improvement. Second, we show how both models can be used to perform a combined rescoring of two n-best lists that are part of a repetition pair.
Citations: 2
PDTSL: An annotated resource for speech reconstruction
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777848
Jan Hajic, Silvie Cinková, Marie Mikulová, P. Pajas, J. Ptáček, J. Toman, Zdenka Uresová
Abstract: We present a description of a new resource (Prague Dependency Treebank of Spoken Language) being created for English and Czech to be used for the task of speech understanding, broad natural language analysis for dialog systems and other speech-related tasks, including speech editing. The resources we have created so far contain audio and a standard transcription of spontaneous speech, but as a novel layer, we add an edited ("reconstructed") version of the spoken utterances. These edits go beyond the scope of current speech reconstruction efforts in that we allow, on top of the usual deletions of speech artifacts, fillers, etc., also for word modifications, insertions and word order changes. We have used both monologue and dialogue recordings in English and Czech to verify the feasibility of such transcription. We have also assessed the quality of the resulting annotation, since the relative freedom of the editing raises the issue of what a "correct" annotation is.
Citations: 13
Automatic FrameNet-based annotation of conversational speech
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777843
Bonaventura Coppola, Alessandro Moschitti, Sara Tonelli, G. Riccardi
Abstract: Current Spoken Language Understanding technology is based on a simple concept annotation of word sequences, where the interdependencies between concepts and their compositional semantics are neglected. This prevents an effective handling of language phenomena, with a consequential limitation on the design of more complex dialog systems. In this paper, we argue that shallow semantic representation as formulated in the Berkeley FrameNet Project may be useful to improve the capability of managing more complex dialogs. To prove this, the first step is to show that a FrameNet parser of sufficient accuracy can be designed for conversational speech. We show that, by exploiting a small set of FrameNet-based manual annotations, it is possible to design an effective semantic parser. Our experiments on an Italian spoken dialog corpus, created within the LUNA project, show that our approach is able to automatically annotate unseen dialog turns with high accuracy.
Citations: 11
Simultaneous machine translation of German lectures into English: Investigating research challenges for the future
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777883
Matthias Wölfel, M. Kolss, Florian Kraft, J. Niehues, M. Paulik, A. Waibel
Abstract: An increasingly globalized world fosters the exchange of students, researchers or employees. As a result, situations in which people of different native tongues are listening to the same lecture become more and more frequent. In many such situations, human interpreters are prohibitively expensive or simply not available. For this reason, and because first prototypes have already demonstrated the feasibility of such systems, automatic translation of lectures receives increasing attention. A large vocabulary and strong variations in speaking style make lecture translation a challenging, though not hopeless, task. The scope of this paper is to investigate a variety of challenges and to highlight possible solutions in building a system for simultaneous translation of lectures from German to English. While some of the investigated challenges are more general, e.g. environment robustness, other challenges are more specific to this particular task, e.g. pronunciation of foreign words or sentence segmentation. We also report our progress in building an end-to-end system and analyze its performance in terms of objective and subjective measures.
Citations: 9
Speaker turn characterization for spoken dialog system monitoring and adaptation
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777860
Géraldine Damnati, Frédéric Béchet, R. Mori
Abstract: This paper describes an utterance classification method based on a multiple decoding scheme. We use the Spoken Language Understanding (SLU) strategy proposed within the European project LUNA. The goal of this classification process is to characterize each speaker's turn, in a dialog context, according to different categories relevant from an SLU point of view: out-of-domain messages, requests not covered by the interpretation module, frequent requests, etc. These categories are used for two purposes in an off-line mode: system monitoring for detecting changes in users' behaviour, and system adaptation by selecting dialogs likely to contain some phenomenon poorly covered by the models for an active learning scheme. All the models and the evaluations are performed on the France Telecom FT3000 corpus.
Citations: 1
Class-based named entity translation in a speech to speech translation system
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777888
S. Maskey, Martin Cmejrek, Bowen Zhou, Yuqing Gao
Abstract: Named entity (NE) translation is a challenging problem in machine translation (MT). Most of the training bi-text corpora for MT lack enough samples of NEs to cover the wide variety of contexts NEs can appear in. In this paper, we present a technique to translate NEs based on their NE types in addition to a phrase-based translation model. Our NE translation model is based on a syntax-based system similar to the work of Chiang (2005); but we produce syntax-based rules with non-terminals as NE types instead of general non-terminals. Such class-based rules allow us to better generalize the context NEs. We show that our proposed method obtains an improvement of 0.66 BLEU score absolute as well as 0.26% in F1-measure over the baseline of phrase-based model in NE test set.
Citations: 2
Using prior knowledge to assess relevance in speech summarization
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777867
Ricardo Ribeiro, David Martins de Matos
Abstract: We explore the use of topic-based automatically acquired prior knowledge in speech summarization, assessing its influence throughout several term weighting schemes. All information is combined using latent semantic analysis as a core procedure to compute the relevance of the sentence-like units of the given input source. Evaluation is performed using the self-information measure, which tries to capture the informativeness of the summary in relation to the summarized input source. The similarity of the output summaries of the several approaches is also analyzed.
Citations: 3
Methods for improving the quality of syllable based speech synthesis
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777832
Y. R. Venugopalakrishna, M. V. Vinodh, H. Murthy, C. S. Ramalingam
Abstract: Our earlier work [1] on speech synthesis has shown that syllables can produce reasonably natural quality speech. Nevertheless, audible artifacts are present due to discontinuities in pitch, energy, and formant trajectories at the joining point of the units. In this paper, we present some minimal signal modification techniques for reducing these artifacts.
Citations: 16