2008 IEEE Spoken Language Technology Workshop: Latest Papers

Morphological random forests for language modeling of inflectional languages
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777872
I. Oparin, O. Glembek, L. Burget, J. Černocký
Abstract: In this paper, we are concerned with using decision trees (DT) and random forests (RF) in language modeling for Czech LVCSR. We show that the RF approach can be successfully implemented for language modeling of an inflectional language. The performance of word-based and morphological DTs and RFs was evaluated on a lecture recognition task. We show that while DTs perform worse than conventional trigram language models (LM), RFs of both kinds outperform the latter. WER (up to 3.4% relative) and perplexity (10%) reductions over the trigram model can be gained with morphological RFs. Further improvement is obtained after interpolation of DT and RF LMs with the trigram one (up to 15.6% perplexity and 4.8% WER relative reduction). We also investigate the distribution of morphological feature types chosen for splitting data at different levels of DTs.
Citations: 16
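The abstract of this entry reports gains from interpolating DT/RF language models with a trigram model and expresses them as relative perplexity reductions. As a minimal sketch of what those quantities mean, assuming simple linear interpolation with a single weight (the abstract does not give the weight or the interpolation scheme, and all function names here are hypothetical):

```python
import math


def interpolate(p_a, p_b, lam):
    """Linearly interpolate two model probabilities for the same word.

    lam is the weight of the first model; it is an assumed free parameter.
    """
    return lam * p_a + (1.0 - lam) * p_b


def perplexity(word_probs):
    """Perplexity of a word sequence from its per-word probabilities."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def relative_reduction(baseline, improved):
    """Relative reduction in percent, e.g. for perplexity or WER."""
    return 100.0 * (baseline - improved) / baseline
```

With these definitions, a model that assigns every word probability 0.25 has perplexity 4, and moving from a perplexity of 200 to 180 is a 10% relative reduction.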
Performance analysis of spectral and prosodic features and their fusion for emotion recognition in speech
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777903
Manish Gaurav
Abstract: In this paper, we study the performance of different prosodic and spectral features of speech on an emotion detection task. In particular, a feature selection algorithm has been used to assess the relevance of the different features. Gaussian mixture models (GMM) have been used to model the features extracted at the frame level, while support vector machines (SVM) and k-nearest neighbor (k-NN) methods have been used to model the features extracted at the utterance level. We use a normalization approach (T-norm) to combine the scores from the different models. The results using the above approach are reported for the Berlin emotional database corpus, and the task consisted of classifying six emotions: anger, happiness, neutral, sadness, boredom and anxiety. We show that the use of a feature selection algorithm improves the result, while in addition the fusion of GMM and SVM results in an overall accuracy of 75.4% for the above task.
Citations: 15
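The abstract of this entry says that scores from different models are combined after T-norm normalization. The sketch below illustrates the general idea only: a score is shifted and scaled by the mean and standard deviation of a set of cohort scores, and the normalized scores of several systems are then summed. Using the other emotion classes as the cohort, the sum fusion rule, and every name in the code are assumptions made for illustration, not details taken from the paper.

```python
import statistics


def t_norm(score, cohort_scores):
    """Normalize a score by the mean and standard deviation of cohort scores."""
    mu = statistics.mean(cohort_scores)
    sigma = statistics.pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (score - mu) / sigma


def fuse_and_classify(system_scores):
    """Sum normalized scores over systems and return the best class label.

    system_scores is a list of dicts (one per system) mapping a class label
    to that system's raw score. Assumption: for each class, the scores of
    the remaining classes act as the cohort.
    """
    labels = list(system_scores[0])
    fused = {label: 0.0 for label in labels}
    for scores in system_scores:
        for label in labels:
            cohort = [scores[other] for other in labels if other != label]
            fused[label] += t_norm(scores[label], cohort)
    return max(fused, key=fused.get)
```

Normalizing first puts scores with very different ranges, such as GMM log-likelihoods and SVM margins, on a comparable scale before they are added.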
Impact of dynamic model adaptation beyond speech recognition
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777894
Fernando Batista, R. Amaral, I. Trancoso, N. Mamede
Abstract: The application of speech recognition to live subtitling of Broadcast News has motivated the adaptation of the lexical and language models of the recognizer on a daily basis with text material retrieved from online newspapers. This paper studies the impact of this adaptation on two of the blocks following the speech recognition module: capitalization and topic indexation. We describe and evaluate different adaptation approaches that try to explore the language dynamics.
Citations: 2
Recent improvements in BBN's English/Iraqi speech-to-speech translation system
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777886
F. Choi, S. Tsakalidis, S. Saleem, C. Kao, R. Meermeier, K. Krstovski, C. Moran, Krishna Subramanian, R. Prasad, P. Natarajan
Abstract: We report on recent improvements in our English/Iraqi Arabic speech-to-speech translation system. User interface improvements include a novel parallel approach to user confirmation which makes confirmation cost-free in terms of dialog duration. Automatic speech recognition improvements include the incorporation of state-of-the-art techniques in feature transformation and discriminative training. Machine translation improvements include a novel combination of multiple alignments derived from various pre-processing techniques, such as Arabic segmentation and English word compounding, higher-order N-grams for the target language model, and the use of context in the form of semantic classes and part-of-speech tags.
Citations: 9
Correcting ASR outputs: Specific solutions to specific errors in French
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777878
Richard Dufour, Y. Estève
Abstract: Automatic speech recognition (ASR) systems are used in a large number of applications, in spite of the inevitable recognition errors. In this study we propose a pragmatic approach to automatically repair ASR outputs by taking into account linguistic and acoustic information, using formal rules or stochastic methods. The proposed strategy consists in developing a specific correction solution for each specific kind of error. In this paper, we apply this strategy to two case studies specific to the French language. We show that it is possible, on automatic transcriptions of French broadcast news, to decrease the error rate of a specific error by 11.4% in one of the two case studies, and by 86.4% in the other. These results are encouraging and show the interest of developing more specific solutions to cover a wider set of errors in future work.
Citations: 8
Automatic keyword extraction for the meeting corpus using supervised approach and bigram expansion
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/slt.2008.4777870
Fei Liu, Feifan Liu, Yang Liu
Citations: 53
Continuous topic language modeling for speech recognition
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777873
C. Chueh, Jen-Tzung Chien
Abstract: A continuous representation of word sequences can effectively solve the data sparseness problem in n-gram language models, where words are represented as discrete variables and unseen events are prone to happen. This problem is increasingly severe when extracting long-distance regularities for high-order n-gram models. Rather than considering the discrete word space, we construct a continuous space of word sequences in which the latent topic information is extracted. The continuous vector is formed by the topic posterior probabilities, and the least-squares projection matrix from the discrete word space to the continuous topic space is estimated accordingly. Unseen words can be predicted through the new continuous latent topic language model. In experiments on continuous speech recognition, we obtain a significant performance improvement over the conventional topic-based language model.
Citations: 1
“Who is this” quiz dialogue system and users' evaluation
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777862
M. Sawaki, Yasuhiro Minami, Ryuichiro Higashinaka, Kohji Dohsaka, Eisaku Maeda
Abstract: In order to design a dialogue system that users enjoy and want to be near for a long time, it is important to know the effect of the system's actions on users. This paper describes the “Who is this” quiz dialogue system and its users' evaluation. Its quiz-style information presentation has been found effective for educational tasks. In our ongoing effort to make it closer to a conversational partner, we implemented the system as a stuffed toy (or CG equivalent). Quizzes are automatically generated from Wikipedia articles, rather than from hand-crafted sets of biographical facts. Network mining is utilized to prepare adaptive system responses. Experiments showed the effectiveness of the person network and the relationship between user attributes and interest level.
Citations: 2
Caller Experience: A method for evaluating dialog systems and its automatic prediction
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777857
Keelan Evanini, P. Hunter, J. Liscombe, David Suendermann-Oeft, K. Dayanidhi, R. Pieraccini
Abstract: In this paper we introduce a subjective metric for evaluating the performance of spoken dialog systems, caller experience (CE). CE is a useful metric for tracking the overall performance of a system in deployment, as well as for isolating individual problematic calls in which the system underperforms. The proposed CE metric differs from most performance evaluation metrics proposed in the past in that it is a) a subjective, qualitative rating of the call, and b) provided by expert, external listeners, not the callers themselves. The results of an experiment in which a set of human experts listened to the same calls three times are presented. The fact that these results show a high level of agreement among different listeners, despite the subjective nature of the task, demonstrates the validity of using CE as a standard metric. Finally, an automated rating system using objective measures is shown to perform at the same high level as the humans. This is an important advance, since it provides a way to reduce the human labor costs associated with producing a reliable CE.
Citations: 32
Starting to cook a tutoring dialogue system
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777861
Filipe M. Martins, Joana Paulo Pardal, Luís Franqueira, Pedro Arez, N. Mamede
Abstract: This paper presents a system that helps you cook a recipe through a spoken dialogue tutoring session. We report our experience while creating the first version of a tutoring dialogue system that helps the user cook a selected dish. Having a working framework to support us with the creation of the cooking assistant, the main challenge we faced was the change of paradigm: instead of the system being driven by the user, the user is instructed by the system. The result is a system capable of dictating generic contents to the user. On top of it, the system can be used in several domains where the goal is not the replacement of the user but providing some assistance while (s)he performs some procedural task.
Citations: 14