{"title":"Word Reordering Alignment for Combination of Statistical Machine Translation Systems","authors":"Maoxi Li, Chengqing Zong","doi":"10.1109/CHINSL.2008.ECP.80","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.80","url":null,"abstract":"Word alignment is a basic and critical process in the Statistical Machine Translation (SMT). The previous work on word alignment mainly focuses on the training process to get the word mapping relation between the source sentences and target sentences. However, the word alignment for combination of SMT system outputs is also important, which aims to find the word correspondence between alternative translation hypotheses of a source language sentence. Unfortunately, it does not attract so much attention in SMT research. In this paper, we propose a novel word alignment approach to effectively address the word alignment between sentences with different valid word orders, which changes the order of the word sequences (called word reordering) of the output hypotheses to make the word order more exactly match the alignment reference. We present experimental results on the IWSLT'2008 challenge tasks with the combination of four state-of-the-art SMT systems outputs. The results show that our approach significantly improves the performance of the system combination.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133712499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Order Adaptation of the Fractional Fourier Transform Using the Intraframe Pitch Change Rate for Speech Recognition","authors":"Hui Yin, C. Nadeu, V. Hohmann, Xiang Xie, Jingming Kuang","doi":"10.1109/CHINSL.2008.ECP.60","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.60","url":null,"abstract":"We propose an acoustic feature for speech recognition based on the combination of MFCC and fractional Fourier transform (FrFT). The transform orders for FrFT are adaptively set according to the intraframe pitch change rate. This method is motivated by the fact that the speech is not stationary even in a short period of time, and the idea is shown using an AM-FM speech model and some spectrograms of an artificial periodic signal. Experiments were conducted on the intervocalic English consonants provided by Interspeech 2008 Consonant Challenge and a Mandarin connected digits corpus. The performance of the proposed method is compared with the MFCC baseline system. Experimental results show that the proposed features get a slightly better recognition rate than MFCCs presumably because they can better track the dynamic characteristics of the speech harmonics.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133521453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prosodic Modeling for Isolated Mandarin Words and its Application","authors":"Hung-Kuang Shih, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen","doi":"10.1109/CHINSL.2008.ECP.70","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.70","url":null,"abstract":"In this paper, a new approach to syllable-based modeling of F0 contour, duration and energy for isolated Mandarin words is proposed. The syllable F0 contour model considers three major affecting factors, including lexical tone, syllable position in a word and inter-syllable coarticulation effect; while both the duration and energy models additionally consider one more affecting factor of base syllable type. Experimental results on a large single-speaker database showed that the method performed very well. Based on the prosodic model, a learning system for Mandarin word prosody pronunciation is designed and implemented for normative speakers.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125092058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting and Tagging Dialog-Act Using MDP and SVM","authors":"Keyan Zhou, Chengqing Zong, Hua Wu, Haifeng Wang","doi":"10.1109/CHINSL.2008.ECP.85","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.85","url":null,"abstract":"Dialog-act tagging is one of the hot topics in processing human-human conversation. In this paper, we introduce a novel model to predict and tag the dialog-act, in which Markov decision process (MDP) is utilized to predict the dialog-act sequence instead of using traditional dialog-act based n-gram, and Support Vector Machine (SVM) is employed to classify the dialog-act for each utterance. The predicting result of MDP and the classifying result of SVM are integrated as the final tagging. The experimental results have shown that our approach outperforms the traditional method.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123616749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Vocabulary Continuous Speech Recognition in Uyghur: Data Preparation and Experimental Results","authors":"Nasirjan Tursun, Wushour Silamu","doi":"10.1109/CHINSL.2008.ECP.61","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.61","url":null,"abstract":"Uyghur is an agglutinative language. It is one of the least studied languages in the area of speech recognition. In this work, we present the research process of Uyghur large vocabulary continuous speech recognition based on HMM (hidden Markov model). This paper introduces the process of data collection (text corpus and speech corpus), the unit selection for speech recognition, and the creation of acoustic and language models for the Uyghur language. It also presents the experimental results of Uyghur continuous speech recognition with different recognition units.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131211668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CityBrowser II: A Multimodal Restaurant Guide in Mandarin","authors":"Jingjing Liu, Yushi Xu, S. Seneff, V. Zue","doi":"10.1109/CHINSL.2008.ECP.50","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.50","url":null,"abstract":"In this paper we present a conversational dialogue system, CityBrowser II, which allows users to inquire about information about restaurants in Mandarin. Developed in the Galaxy infrastructure with a common, language-independent semantic representation, CityBrowser integrates portability and scalability. By inheriting the infrastructure and main language understanding/generation components from its English predecessor, CityBrowser can easily be transformed to a Mandarin language environment. This paper describes our system implementation, focusing on the language-specific modifications to the original English system. We show that our language-independent yet scalable system infrastructure makes multilingualism a promising task.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127894671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reference Eigen-Environment and Speaker Weighting for Robust Speech Recognition","authors":"Y. Liao, Hung-Hsiang Fang, C. Yang","doi":"10.1109/CHINSL.2008.ECP.31","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.31","url":null,"abstract":"In this paper a reference eigen-environment and speaker weighting (RESW) method is proposed for online HMM adaptation. RESW establishes multiple eigen-MLLR subspaces as the set of a priori knowledge according to certain affecting factors, such as noise type, SNR, male and female. It then projects an input test utterance simultaneously into the set of eigen-subspaces and optimally synthesizes out a set of suitable HMMs. The proposed RESW was evaluated on Aurora 2 multi-condition training task. Experimental results showed that average word error rate (WER) of 6.11% was achieved. RESW not only outperformed the multi-condition training baseline (Multi-Con., 13.72%) but also the blind ETSI advanced DSR front-end (ETSI-Adv., 8.65%) and the histogram equalization (HEQ, 8.66%) and the non-blind reference model weighting (RMW, 7.29%) and Eigen-MLLR (6.14%) approaches.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125878491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Adaptation Schemes In PR-SVM Based Language Recognition","authors":"Xu Bing, Yan Song, Lirong Dai","doi":"10.1109/CHINSL.2008.ECP.95","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.95","url":null,"abstract":"Phonetic-based systems usually convert the input speech into token (i.e. word, phone etc.) sequence and determine the target language from the statistics of the token sequences on different languages. Generally, there are two kinds of statistical representation for token sequences, N-gram language model (PR-LM) and support vector machines (PR-SVM) to perform language classification. In this paper we focus on PR-SVM method. One problem of the PR-SVM is that the statistical representation based on utterance is sparse and inaccurate. To tackle this issue, the adaptation schemes in PR-SVM framework are proposed in this paper. There are two schemes to be used: 1) Adaptation from the Universal N-gram Language Model (UNLM) trained on all languages; 2) Adaptation from the Low-Order N-gram Language Model (LONLM). The experimental results on 2007 NIST LRE tasks show that our method achieves significant gains over the unadapted model.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123172380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Pseudo-Key for Language Recognition System Design","authors":"Hanwu Sun, B. Ma, Haizhou Li","doi":"10.1109/CHINSL.2008.ECP.55","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.55","url":null,"abstract":"In this paper, we present a novel pseudo-key analysis approach for the fusion system of language recognition. The state-of-the-art language recognition systems for the NIST language recognition evaluation (LRE) commonly consist of multiple language classifiers. To prevent the fusion system from being spoiled by one abnormal classifier, pseudo keys are designed to check the integrity of each of the individual classifiers before the system fusion. The scores of individual classifiers are cross-validated based on the pseudo keys. The language recognition experiments are conducted on the 2007 NIST LRE corpus based on the Institute for Infocomm Research's submission.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117353333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intonational Prominence of \"SHI...(DE)\" Construction in Standard Chinese","authors":"Yuan Jia, Ai-jun Li, Ziyu Xiong","doi":"10.1109/CHINSL.2008.ECP.27","DOIUrl":"https://doi.org/10.1109/CHINSL.2008.ECP.27","url":null,"abstract":"The present study mainly deals with the phonetic realization of the intonational prominence in the shi...(de) construction in standard Chinese. Results of acoustic and perceptual experiments demonstrate that the prominence placement bears corresponding relationship with the focused constituents marked by shi...(de) structure, specifically, the appearance of intonational prominence is symbolized by the focus marker shi. The phonetic realization of the intonational prominence lies in the expansion of the pitch range of the focus-bearing constituent and the compression of the pitch registers of the successive syllables.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130302399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}