Latest publications from the 2008 IEEE Spoken Language Technology Workshop

Quantitative evaluation of dialog corpora acquired through different techniques
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777851
D. Griol, L. Hurtado, E. Segarra, E. Arnal
Abstract: In this paper, we present the results of the comparison between three corpora acquired by means of different techniques. The first corpus was acquired using the Wizard of Oz technique. A statistical user simulation technique has been developed for the acquisition of the second corpus. In this technique, the next user answer is selected by means of a classification process that takes into account the previous user turns, the last system answer and the objective of the dialog. Finally, a dialog simulation technique has been developed for the acquisition of the third corpus. This technique uses a random selection of the user and system turns, defining stop conditions for automatically deciding if the simulated dialog is successful or not. We use several evaluation measures proposed in previous research to compare our three acquired corpora, and then discuss the similarities and differences with regard to these measures.
Citations: 0
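As a rough illustration of the random dialog-simulation technique described in this abstract, the Python sketch below alternates randomly chosen user and system turns and applies simple stop conditions to label a simulated dialog as successful or not. The turn inventories, goal set, and success criterion are hypothetical stand-ins, not the authors' task definition.

```python
import random

# Hypothetical turn inventories and goal set; the real system's dialog acts
# and success criteria are task-specific and not specified in the abstract.
USER_TURNS = ["give_origin", "give_destination", "give_date", "confirm", "hang_up"]
SYSTEM_TURNS = ["ask_origin", "ask_destination", "ask_date", "confirm", "close"]
GOAL = {"give_origin", "give_destination", "give_date"}

def simulate_dialog(max_turns=30):
    """Randomly alternate system and user turns until a stop condition fires."""
    provided = set()
    history = []
    for _ in range(max_turns):
        system = random.choice(SYSTEM_TURNS)
        user = random.choice(USER_TURNS)
        history.append((system, user))
        if user in GOAL:
            provided.add(user)
        # Stop conditions: the user hangs up, or the goal is fully covered.
        if user == "hang_up":
            return history, False
        if GOAL <= provided:
            return history, True
    # Too long without reaching the goal: discard as unsuccessful.
    return history, False

if __name__ == "__main__":
    dialogs = [simulate_dialog() for _ in range(1000)]
    success_rate = sum(ok for _, ok in dialogs) / len(dialogs)
    print(f"successful simulated dialogs: {success_rate:.1%}")
```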
Real-time speech recognition captioning of events and meetings
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777874
Gilles Boulianne, M. Boisvert, Frédéric Osterrath
Abstract: Real-time speech recognition captioning has not progressed much beyond television broadcast to other tasks such as meetings in the workplace. A number of obstacles prevent this transition, such as the lack of proper means to receive and display captions or the cost of on-site shadow speakers. More problematic is the insufficient performance of speech recognition for less formal and one-time events. We describe how we developed a mobile platform for remote captioning during trials in several conferences and meetings. We also show that sentence selection based on relative entropy allows adequate language models to be trained from small amounts of in-domain data, making real-time captioning of an event possible with only a few hours of preparation.
Citations: 1
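The sentence-selection idea mentioned above can be sketched with a simple cross-entropy criterion against an in-domain unigram model: general-pool sentences whose word distribution is closest to the in-domain data are kept for language model training. This is one common relative-entropy-style selection rule and is not claimed to be the exact criterion used in the paper.

```python
import math
from collections import Counter

def unigram_model(sentences, smoothing=1e-6):
    """Build a smoothed unigram distribution from whitespace-tokenized sentences."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) or 1
    def prob(word):
        return (counts[word] + smoothing) / (total + smoothing * vocab)
    return prob

def per_word_cross_entropy(sentence, prob):
    """Average negative log-probability of the sentence under the unigram model."""
    words = sentence.split()
    if not words:
        return float("inf")
    return -sum(math.log(prob(w)) for w in words) / len(words)

def select_sentences(in_domain, general_pool, top_k):
    """Keep the general-pool sentences closest to the in-domain distribution."""
    prob = unigram_model(in_domain)
    scored = sorted(general_pool, key=lambda s: per_word_cross_entropy(s, prob))
    return scored[:top_k]

# Toy usage: a few in-domain sentences steer selection from a larger pool.
in_domain = ["the meeting will start at nine", "please open the agenda"]
pool = ["the agenda for the meeting is short", "stock prices fell sharply today"]
print(select_sentences(in_domain, pool, top_k=1))
```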
Open vocabulary spoken document retrieval by subword sequence obtained from speech recognizer
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777900
Go Kuriki, Y. Itoh, K. Kojima, M. Ishigame, Kazuyo Tanaka, Shi-wook Lee
Abstract: We present a method for open vocabulary retrieval based on a spoken document retrieval (SDR) system using subword models. The paper proposes a new approach to open vocabulary SDR using subword models that does not require subword recognition. Instead, subword sequences are obtained from the phone sequence output by a speech recognizer: when the speech contains an out-of-vocabulary (OOV) word, the recognizer outputs a word sequence whose phone sequence is considered to be similar to the OOV word. When OOV words are provided in a query, the proposed system is able to retrieve the target section by comparing the phone sequences of the query and the word sequence generated by the speech recognizer.
Citations: 0
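A minimal sketch of the retrieval step described here: approximate matching of the query's phone sequence against the phone sequence derived from the recognizer's word output, using edit-distance dynamic programming with a free start position. The phone symbols and costs are illustrative only and do not reproduce the paper's subword units.

```python
def approximate_match(query_phones, doc_phones):
    """Find the end position and cost of the best approximate occurrence of the
    query phone sequence inside the document phone sequence (edit-distance DP
    in which the match may start anywhere in the document for free)."""
    m, n = len(query_phones), len(doc_phones)
    # prev[j] = cost of aligning the current query prefix so that it ends at doc position j
    prev = [0] * (n + 1)          # the empty query matches anywhere at cost 0
    for i in range(1, m + 1):
        curr = [i] + [0] * n      # aligning a query prefix against an empty document costs i
        for j in range(1, n + 1):
            sub = 0 if query_phones[i - 1] == doc_phones[j - 1] else 1
            curr[j] = min(prev[j - 1] + sub,   # substitution / match
                          prev[j] + 1,         # deletion from the query
                          curr[j - 1] + 1)     # insertion from the document
        prev = curr
    best_end = min(range(n + 1), key=lambda j: prev[j])
    return best_end, prev[best_end]

# Toy usage: phones of an OOV query matched against recognizer output phones.
query = ["k", "u", "r", "i", "k", "i"]
doc = ["o", "k", "u", "r", "i", "g", "i", "a", "o"]
print(approximate_match(query, doc))
```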
Automatic title generation for Chinese spoken documents with a delicate scored Viterbi algorithm
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777866
Sheng-yi Kong, Chien-Chih Wang, Ko-chien Kuo, Lin-Shan Lee
Abstract: Automatic title generation for spoken documents is believed to be an important key for browsing and navigation over huge quantities of multimedia content. A new framework of automatic title generation for Chinese spoken documents is proposed in this paper, using a delicately scored Viterbi algorithm performed over automatically generated text summaries of the testing spoken documents. The Viterbi beam search is guided by a score evaluated from three sets of models: a term selection model identifies the most suitable terms to be included in the title, a term ordering model gives the best ordering of those terms to make the title readable, and a title length model estimates a reasonable length for the title. The models are trained from a training corpus that is not required to match the testing spoken documents. Both objective evaluation based on the F1 measure and subjective human evaluation of relevance and readability indicate that the approach is very attractive.
Citations: 8
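The scored Viterbi search described in the abstract can be approximated by a beam search whose partial-title score combines the three model scores it names. The toy probability tables below are hypothetical placeholders for the trained term selection, term ordering, and title length models, not values from the paper.

```python
import math

# Hypothetical model scores standing in for the three trained models.
SELECTION = {"taiwan": 0.9, "economy": 0.8, "growth": 0.7, "the": 0.2}   # P(term appears in title)
ORDER = {("taiwan", "economy"): 0.6, ("economy", "growth"): 0.7}         # P(next term | previous term)
LENGTH = {2: 0.3, 3: 0.5, 4: 0.2}                                        # P(title length)

def title_beam_search(candidate_terms, max_len=4, beam=5):
    """Beam search for the best-scoring term sequence under the three models."""
    beams = [([], 0.0)]                                   # (partial title, log score)
    complete = []
    for length in range(1, max_len + 1):
        expanded = []
        for title, score in beams:
            for term in candidate_terms:
                if term in title:
                    continue
                s = score + math.log(SELECTION.get(term, 1e-3))
                if title:
                    s += math.log(ORDER.get((title[-1], term), 1e-3))
                expanded.append((title + [term], s))
        beams = sorted(expanded, key=lambda x: x[1], reverse=True)[:beam]
        for title, score in beams:
            complete.append((title, score + math.log(LENGTH.get(length, 1e-3))))
    return max(complete, key=lambda x: x[1])

print(title_beam_search(["taiwan", "economy", "growth", "the"]))
```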
Name aware speech-to-speech translation for English/Iraqi
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777887
R. Prasad, C. Moran, F. Choi, R. Meermeier, S. Saleem, C. Kao, D. Stallard, P. Natarajan
Abstract: In this paper, we describe a novel approach that exploits intra-sentence and dialog-level context to improve translation performance on spoken Iraqi utterances that contain named entities (NEs). Dialog-level context is used to predict whether the Iraqi response is likely to contain names, and intra-sentence context is used to determine which words are named entities. While we do not address the problem of translating out-of-vocabulary (OOV) NEs in spoken utterances, we show that our approach is capable of translating OOV names in text input. To demonstrate the efficacy of our approach, we present results on an internal test set as well as the June 2008 DARPA TRANSTAC name evaluation set.
Citations: 6
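A heavily simplified, hypothetical sketch of the two-stage idea stated in the abstract: a dialog-level gate predicts from the preceding English turn whether a name is expected, and an intra-sentence step labels name tokens only when the gate is open. The cue phrases and lexicon are invented for illustration and do not reflect the authors' models.

```python
def likely_contains_name(prev_english_turn):
    """Hypothetical dialog-level gate: if the preceding English prompt asks for
    a name, expect a name in the Iraqi response."""
    cues = ("what is your name", "who is", "name of")
    return any(c in prev_english_turn.lower() for c in cues)

def tag_names(tokens, gate_open, name_lexicon):
    """Hypothetical intra-sentence step: label tokens as names only when the
    dialog-level gate is open."""
    if not gate_open:
        return [(t, "O") for t in tokens]
    return [(t, "NAME" if t in name_lexicon else "O") for t in tokens]

# Toy usage with an invented lexicon.
prev = "What is your name?"
response = ["ismi", "ahmed"]
print(tag_names(response, likely_contains_name(prev), name_lexicon={"ahmed"}))
```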
Effects of self-disclosure and empathy in human-computer dialogue
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777852
Ryuichiro Higashinaka, Kohji Dohsaka, Hideki Isozaki
Abstract: To build trust or cultivate long-term relationships with users, conversational systems need to perform social dialogue. To date, research has primarily focused on the overall effect of social dialogue in human-computer interaction, leading to little work on the effects of individual linguistic phenomena within social dialogue. This paper investigates such individual effects through dialogue experiments. Focusing on self-disclosure and empathic utterances (agreement and disagreement), we empirically calculate their contributions to the dialogue quality. Our analysis shows that (1) empathic utterances by users are strong indicators of increasing closeness and user satisfaction, (2) the system's empathic utterances are effective for inducing empathy from users, and (3) self-disclosure by users increases when users have positive preferences on the topics being discussed.
Citations: 41
IslEnquirer: Social user model acquisition through network analysis and interactive learning
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777854
F. Putze, H. Holzapfel
Abstract: We present an approach for introducing social awareness into interactive systems. The IslEnquirer is a system that automatically builds social user models. It initializes the models by social network analysis of available offline data. These models are then verified and extended by interactive learning, carried out through a robot-initiated spoken dialog with the user.
Citations: 2
Joint generative and discriminative models for spoken language understanding
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777840
Marco Dinarelli, Alessandro Moschitti, G. Riccardi
Abstract: Spoken Language Understanding aims at mapping a natural language spoken sentence into a semantic representation. In the last decade two main approaches have been pursued: generative and discriminative models. The former are more robust to overfitting, whereas the latter are more robust to many irrelevant features. Additionally, the way in which these approaches encode prior knowledge is very different, and their relative performance changes depending on the task. In this paper we describe a training framework in which both models are used: a generative model produces a list of ranked hypotheses, and a discriminative model, based on string kernels and Support Vector Machines, re-ranks this list. We tested this approach on a new corpus produced in the European LUNA project. The results show a large improvement over the state of the art in concept segmentation and labeling.
Citations: 3
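The re-ranking pipeline described here can be sketched as follows: a generative model supplies an n-best list with scores, and an injected discriminative scorer (standing in for the paper's string-kernel SVM) re-orders it. The hypothesis structure and the toy scorer are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Hypothesis:
    concepts: List[str]      # candidate concept segmentation/labeling
    gen_score: float         # log-probability from the generative model

def rerank(nbest: List[Hypothesis],
           disc_score: Callable[[Hypothesis], float],
           weight: float = 1.0) -> List[Hypothesis]:
    """Re-rank a generative n-best list with a discriminative scorer.
    In the paper the scorer is an SVM with string kernels over the concept
    sequence; here it is just an injected callable."""
    return sorted(nbest, key=lambda h: h.gen_score + weight * disc_score(h),
                  reverse=True)

# Toy usage with a hypothetical scorer that prefers hypotheses containing
# a 'departure_city' concept.
nbest = [
    Hypothesis(["greeting", "city"], gen_score=-2.0),
    Hypothesis(["greeting", "departure_city"], gen_score=-2.3),
]
best = rerank(nbest, disc_score=lambda h: 1.0 if "departure_city" in h.concepts else 0.0)
print(best[0].concepts)
```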
Discriminative learning using linguistic features to rescore n-best speech hypotheses
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777849
Maria Georgescul, Manny Rayner, P. Bouillon, Nikos Tsourakis
Abstract: We describe how we were able to improve the accuracy of a medium-vocabulary spoken dialog system by rescoring the list of n-best recognition hypotheses using a combination of acoustic, syntactic, semantic and discourse information. The non-acoustic features are extracted from different intermediate processing results produced by the natural language processing module and are automatically filtered. We apply discriminative support vector learning designed for re-ranking, using both word error rate and semantic error rate as the ranking target value, and evaluate using five-fold cross-validation; to show the robustness of our method, confidence intervals for word and semantic error rates are computed via bootstrap sampling. The reduction in semantic error rate, from 19% to 11%, is statistically significant at the 0.01 level.
Citations: 4
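The bootstrap confidence intervals mentioned in this abstract can be computed roughly as below, by resampling utterances with replacement and recomputing the error rate on each resample. This is a generic percentile bootstrap sketch, not necessarily the authors' exact procedure.

```python
import random

def bootstrap_interval(per_utterance_errors, per_utterance_totals,
                       n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an error rate computed as
    total errors / total reference units over a test set."""
    rng = random.Random(seed)
    n = len(per_utterance_errors)
    rates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]       # resample utterances
        err = sum(per_utterance_errors[i] for i in idx)
        tot = sum(per_utterance_totals[i] for i in idx)
        rates.append(err / tot)
    rates.sort()
    lo = rates[int((alpha / 2) * n_resamples)]
    hi = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Toy usage: error counts and reference lengths per utterance.
errors = [1, 0, 2, 0, 1, 3, 0, 1]
lengths = [7, 5, 9, 6, 8, 10, 4, 6]
print(bootstrap_interval(errors, lengths))
```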
A research bed for unit selection based text to speech synthesis
2008 IEEE Spoken Language Technology Workshop Pub Date: 2008-12-01 DOI: 10.1109/SLT.2008.4777882
K. Sarathy, A. Ramakrishnan
Abstract: The paper describes a modular, unit selection based TTS framework which can be used as a research bed for developing TTS in any new language, as well as for studying the effect of changing any parameter during synthesis. Using this framework, a TTS system has been developed for Tamil. The synthesis database consists of 1027 phonetically rich pre-recorded sentences. The framework has already been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation, and we propose to exploit this misperception to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with their commonly confused counterparts.
Citations: 10
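Unit selection of the kind this framework implements is commonly driven by a target-cost plus concatenation-cost search over candidate units; the sketch below shows a minimal dynamic-programming version with invented unit features and cost functions, not the framework's actual cost model.

```python
def select_units(targets, candidates, target_cost, join_cost):
    """Dynamic-programming unit selection: pick one candidate unit per target
    position, minimizing summed target and concatenation (join) costs."""
    # best[i][k] = (cost, backpointer) for choosing candidates[i][k] at slot i
    best = [[(target_cost(targets[0], u), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            tc = target_cost(targets[i], u)
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev_u, u) + tc, k)
                for k, prev_u in enumerate(candidates[i - 1]))
            row.append((cost, back))
        best.append(row)
    # Trace back the cheapest path.
    k = min(range(len(best[-1])), key=lambda j: best[-1][j][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][k])
        k = best[i][k][1] if best[i][k][1] is not None else k
    return list(reversed(path))

# Toy usage: units are (phone, pitch) pairs; costs are simple mismatch penalties.
targets = [("t", 120), ("a", 130)]
candidates = [[("t", 118), ("t", 150)], [("a", 128), ("a", 90)]]
tc = lambda t, u: abs(t[1] - u[1]) / 10                    # prosodic target mismatch
jc = lambda a, b: 0.0 if abs(a[1] - b[1]) < 30 else 1.0    # pitch continuity at the join
print(select_units(targets, candidates, tc, jc))
```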