2008 IEEE Spoken Language Technology Workshop最新文献

筛选
英文 中文
Speech-to-text input method for web system using JavaScript 语音转文本输入法的web系统,使用JavaScript
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777877
R. Nisimura, Jumpei Miyake, Hideki Kawahara, T. Irino
{"title":"Speech-to-text input method for web system using JavaScript","authors":"R. Nisimura, Jumpei Miyake, Hideki Kawahara, T. Irino","doi":"10.1109/SLT.2008.4777877","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777877","url":null,"abstract":"We have developed a speech-to-text input method for web systems. The system is provided as a JavaScript library including an Ajax-like mechanism based on a Java applet, CGI programs, and dynamic HTML documents. It allows users to access voice-enabled web pages without requiring special browsers. Web developers can embed it on their web page by inserting only one line in the header field of an HTML document. This study also aims at observing natural spoken interactions in personal environments. We have succeeded in collecting 4,003 inputs during a period of seven months via our public Japanese ASR server. In order to cover out-of-vocabulary words to cope with some proper nouns, a web page to register new words into the language model are developed. As a result, we could obtain an improvement of 0.8% in the recognition accuracy. With regard to the acoustical conditions, an SNR of 25.3 dB was observed.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131683310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
Accented Indian english ASR: Some early results 印度口音英语ASR:一些早期结果
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777881
Kaustubh Kulkarni, Sailik Sengupta, V. Ramasubramanian, Josef G. Bauer, G. Stemmer
{"title":"Accented Indian english ASR: Some early results","authors":"Kaustubh Kulkarni, Sailik Sengupta, V. Ramasubramanian, Josef G. Bauer, G. Stemmer","doi":"10.1109/SLT.2008.4777881","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777881","url":null,"abstract":"The problem of the effect of accent on the performance of Automatic Speech Recognition (ASR) systems is well known. In this paper, we study the effect of accent variability on the performance of the Indian English ASR task. We evaluate the test vocabularies on HMMs trained on (a) Accent specific training data (b) Accent pooled training data which combines all the accent specific training data (c) Accent pooled training data of reduced size matching the size of the accent specific training data. We demonstrate that the accent pooled training set performs the best on phonetically rich isolated word recognition task. But the accent specific HMMs perform better than the reduced accent pooled HMMs, indicating a possible approach of using a first stage accent identification to choose the correct accent trained HMMs for further recognition.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130046411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Evaluation of a spoken dialogue system for controlling a Hifi audio system 用于控制高保真音频系统的口语对话系统的评价
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777859
F. Fernández-Martínez, Juan Blázquez, J. Ferreiros, R. Barra-Chicote, Javier Macias-Guarasa, J. M. Lucas
{"title":"Evaluation of a spoken dialogue system for controlling a Hifi audio system","authors":"F. Fernández-Martínez, Juan Blázquez, J. Ferreiros, R. Barra-Chicote, Javier Macias-Guarasa, J. M. Lucas","doi":"10.1109/SLT.2008.4777859","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777859","url":null,"abstract":"In this paper, a Bayesian networks, BNs, approach to dialogue modelling is evaluated in terms of a battery of both subjective and objective metrics. A significant effort in improving the contextual information handling capabilities of the system has been done. Consequently, besides typical dialogue measurement rates for usability like task or dialogue completion rates, dialogue time, etc. we have included a new figure measuring the contextuality of the dialogue as the number of turns where contextual information is helpful for dialogue resolution. The evaluation is developed through a set of predefined scenarios according to different initiative styles and focusing on the impact of the user's level of experience.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132025327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
A syntactic language model based on incremental CCG parsing 基于增量CCG解析的句法语言模型
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777876
Hany Hassan, K. Sima'an, Andy Way
{"title":"A syntactic language model based on incremental CCG parsing","authors":"Hany Hassan, K. Sima'an, Andy Way","doi":"10.1109/SLT.2008.4777876","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777876","url":null,"abstract":"Syntactically-enriched language models (parsers) constitute a promising component in applications such as machine translation and speech-recognition. To maintain a useful level of accuracy, existing parsers are non-incremental and must span a combinatorially growing space of possible structures as every input word is processed. This prohibits their incorporation into standard linear-time decoders. In this paper, we present an incremental, linear-time dependency parser based on Combinatory Categorial Grammar (CCG) and classification techniques. We devise a deterministic transform of CCG-bank canonical derivations into incremental ones, and train our parser on this data. We discover that a cascaded, incremental version provides an appealing balance between efficiency and accuracy.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114517475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
Class-based named entity translation in a speech to speech translation system 语音到语音翻译系统中基于类的命名实体翻译
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777888
S. Maskey, Martin Cmejrek, Bowen Zhou, Yuqing Gao
{"title":"Class-based named entity translation in a speech to speech translation system","authors":"S. Maskey, Martin Cmejrek, Bowen Zhou, Yuqing Gao","doi":"10.1109/SLT.2008.4777888","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777888","url":null,"abstract":"Named entity (NE) translation is a challenging problem in machine translation (MT). Most of the training bi-text corpora for MT lack enough samples of NEs to cover the wide variety of contexts NEs can appear in. In this paper, we present a technique to translate NEs based on their NE types in addition to a phrase-based translation model. Our NE translation model is based on a syntax-based system similar to the work of Chiang (2005); but we produce syntax-based rules with non-terminals as NE types instead of general non-terminals. Such class-based rules allow us to better generalize the context NEs. We show that our proposed method obtains an improvement of 0.66 BLEU score absolute as well as 0.26% in F1-measure over the baseline of phrase-based model in NE test set.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123634333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Using prior knowledge to assess relevance in speech summarization 运用先验知识评估语音摘要的相关性
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777867
Ricardo Ribeiro, David Martins de Matos
{"title":"Using prior knowledge to assess relevance in speech summarization","authors":"Ricardo Ribeiro, David Martins de Matos","doi":"10.1109/SLT.2008.4777867","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777867","url":null,"abstract":"We explore the use of topic-based automatically acquired prior knowledge in speech summarization, assessing its influence throughout several term weighting schemes. All information is combined using latent semantic analysis as a core procedure to compute the relevance of the sentence-like units of the given input source. Evaluation is performed using the self-information measure, which tries to capture the informativeness of the summary in relation to the summarized input source. The similarity of the output summaries of the several approaches is also analyzed.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124939076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Methods for improving the quality of syllable based speech synthesis 提高基于音节的语音合成质量的方法
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777832
Y. R. Venugopalakrishna, M. V. Vinodh, H. Murthy, C. S. Ramalingam
{"title":"Methods for improving the quality of syllable based speech synthesis","authors":"Y. R. Venugopalakrishna, M. V. Vinodh, H. Murthy, C. S. Ramalingam","doi":"10.1109/SLT.2008.4777832","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777832","url":null,"abstract":"Our earlier work [1] on speech synthesis has shown that syllables can produce reasonably natural quality speech. Nevertheless, audible artifacts are present due to discontinuities in pitch, energy, and formant trajectories at the joining point of the units. In this paper, we present some minimal signal modification techniques for reducing these artifacts.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127221713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
Adaptive filtering for high quality hmm based speech synthesis 基于hmm的高质量语音合成自适应滤波
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777835
L. Coelho, D. Braga
{"title":"Adaptive filtering for high quality hmm based speech synthesis","authors":"L. Coelho, D. Braga","doi":"10.1109/SLT.2008.4777835","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777835","url":null,"abstract":"In this work an adaptive filtering scheme based on a dual Discrete Kalman Filtering (DKF) is proposed for Hidden Markov Model (HMM) based speech synthesis quality enhancement. The objective is to improve signal smoothness across HMMs and their related states and to reduce artifacts due to acoustic model's limitations. Both speech and artifacts are modelled by an autoregressive structure which provides an underlying time frame dependency and improves time-frequency resolution. Themodel parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. The quality enhancement is performed by a dual discrete Kalman filter that simultaneously gives estimates for the models and the signals. The system's performance has been evaluated using mean opinion score tests and the proposed technique has led to improved results.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127482537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Improving word segmentation for Thai speech translation 改进泰语语音翻译的分词方法
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777885
Paisarn Charoenpornsawat, Tanja Schultz
{"title":"Improving word segmentation for Thai speech translation","authors":"Paisarn Charoenpornsawat, Tanja Schultz","doi":"10.1109/SLT.2008.4777885","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777885","url":null,"abstract":"A vocabulary list and language model are primary components in a speech translation system. Generating both from plain text is a straightforward task for English. However, it is quite challenging for Chinese, Japanese, or Thai which provide no word segmentation, i.e. the text has no word boundary delimiter. For Thai word segmentation, maximal matching, a lexicon-based approach, is one of the popular methods. Nevertheless this method heavily relies on the coverage of the lexicon. When text contains an unknown word, this method usually produces a wrong boundary. When extracting words from this segmented text, some words will not be retrieved because of wrong segmentation. In this paper, we propose statistical techniques to tackle this problem. Based on different word segmentation methods we develop various speech translation systems and show that the proposed method can significantly improve the translation accuracy by about 6.42% BLEU points compared to the baseline system.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114588774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Corpus-based synthesis of Mandarin speech with F0 contours generated by superposing tone components on rule-generated phrase components 基于语料库的基于规则生成的短语成分叠加音调成分生成F0轮廓的汉语语音合成
2008 IEEE Spoken Language Technology Workshop Pub Date : 2008-12-01 DOI: 10.1109/SLT.2008.4777833
K. Hirose, Qinghua Sun, N. Minematsu
{"title":"Corpus-based synthesis of Mandarin speech with F0 contours generated by superposing tone components on rule-generated phrase components","authors":"K. Hirose, Qinghua Sun, N. Minematsu","doi":"10.1109/SLT.2008.4777833","DOIUrl":"https://doi.org/10.1109/SLT.2008.4777833","url":null,"abstract":"Mandarin speech synthesis was conducted by generating prosodic features by the proposed method and segmental features by HMM-based method. The proposed method generates sentence fundamental frequency (F0) contours by representing them as a superposition of tone components on phrase components. The tone components are realized by concatenating their fragments at tone nuclei predicted by a corpus-based method, while the phrase components are generated by rules under the generation process model (F0 model) framework. The method includes prediction of phoneme/pause durations in a statistical method as the first step. Through a listening test on the quality of synthetic speech, it was shown that a better quality was obtainable by the method as compared to that by the full HMM-based method. It was also shown that a better quality is obtainable as compared to the case of generating F0 contours without super-positional scheme.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128849050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信