Xinyan Xiao, Jinsong Su, Yang Liu, Qun Liu, Shouxun Lin
{"title":"An Orientation Model for Hierarchical Phrase-Based Translation","authors":"Xinyan Xiao, Jinsong Su, Yang Liu, Qun Liu, Shouxun Lin","doi":"10.1109/IALP.2011.43","DOIUrl":"https://doi.org/10.1109/IALP.2011.43","url":null,"abstract":"Hierarchical phrase-based (HPB) translation exploits the power of grammar to perform long-distance reorderings, without specifying nonterminal orientations against adjacent blocks or considering the lexical information covered by nonterminals. In this paper, we borrow the idea of an orientation model from phrase-based systems to enhance the reordering ability of HPB translation. We distinguish three orientations (monotone, swap, discontinuous) of a nonterminal based on the alignment of grammar, and select the appropriate orientation of a nonterminal using the lexical information it covers. By incorporating the orientation model, our approach significantly outperforms a standard HPB system by up to 1.02 BLEU on the large-scale NIST Chinese-English translation task, and 0.51 BLEU on the WMT German-English translation task.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121901655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discourse Structures of English Exposition","authors":"Donghong Liu, Meizhen Liao","doi":"10.1109/IALP.2011.51","DOIUrl":"https://doi.org/10.1109/IALP.2011.51","url":null,"abstract":"Van Kuppevelt's approach to discourse structure emphasizes that topicality is the general organizing principle. His discourse structure, consisting of MAIN STRUCTURE and SIDE STRUCTURE, has been applied to conversations and discourse segmentations rather than expository essays. In this paper two discourse structures of expository essays are proposed in Van Kuppevelt's framework. The proposed structures can test the soundness of the ways of developing expository essays.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117196230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of Acoustic Space in 3 to 5 Years Old Hindi Speaking Children","authors":"Vaishna Narang, Garima Dalal, Deepshikha Misra","doi":"10.1109/IALP.2011.41","DOIUrl":"https://doi.org/10.1109/IALP.2011.41","url":null,"abstract":"Many studies have described the acoustics of speech focusing on the speech of adults, but only a few have analyzed children's speech. Studies on development of language in children often include their articulatory speech patterns but the process of speech development is only partially understood. This research focuses on the development of Acoustic/Vowel space in Hindi speaking children. The study assumes that the acoustic space is continuously being redefined and modified in order to achieve and maintain a certain perceptual contrast and attempts to explore how acoustic space develops in children from three to five years of age. An acoustic study of seven peripheral vowels of Hindi, this study uses the first two formants of the vowels to arrive at a graphic representation of the acoustic space for peripheral vowels as articulated by the subjects under study. The area of the vowel/acoustic space is then calculated using Irregular Polygon Area Calculator. The development of acoustic space in six Hindi speaking children from 3 to 5 years of age shows interesting results which are presented in this paper.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115244942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Asef Poormasoomi, M. Kahani, Saeed Varasteh Yazdi, Hossein Kamyar
{"title":"Context-Based Persian Multi-document Summarization (Global View)","authors":"Asef Poormasoomi, M. Kahani, Saeed Varasteh Yazdi, Hossein Kamyar","doi":"10.1109/IALP.2011.53","DOIUrl":"https://doi.org/10.1109/IALP.2011.53","url":null,"abstract":"Multi-document summarization is the automatic extraction of information from multiple documents on the same topic. This paper proposes a new method, using LSA, for extracting the global context of a topic and removing sentence redundancy using SRL and WordNet semantic similarity for the Persian language. In previous approaches, the focus was on sentence features (local view), with the sentence as the main and basic unit of text. In this paper, the sentences are selected based on the main context hidden in all the documents of a topic. The experimental results show that our proposed method outperforms other Persian multi-document systems.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122610375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using HTML Tags to Improve Parallel Resources Extraction","authors":"Yanhui Feng, Yu Hong, Wei Tang, Jianmin Yao, Qiaoming Zhu","doi":"10.1109/IALP.2011.23","DOIUrl":"https://doi.org/10.1109/IALP.2011.23","url":null,"abstract":"This paper proposes a new approach to extract parallel resources (including bilingual sentences and bilingual terms) from bilingual web pages, which have a primary language and a secondary language (the secondary language is often a translation of the primary language). Our method is composed of four tasks: 1) parsing the web page into a DOM tree and segmenting the inner text of each node into a series of monolingual snippets; 2) selecting adjacent snippet pairs in different languages and with higher translation scores as seeds for the next task; 3) constructing comprehensive wrappers from selected seeds, which save both HTML and surface formatting styles; 4) mining candidate instances and selecting good instances by their similarities with seeds. In this paper, we first propose to segment text by HTML tags, and select potential parallel resources by ranking all extracted candidates. According to the experimental results, our method can be applied to bilingual pages written in any other pair of languages. Experimental results also show that our approaches are effective in improving parallel resources extraction.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123166737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying Grapheme, Word, and Syllable Information for Language Identification in Code Switching Sentences","authors":"Y. Yeong, T. Tan","doi":"10.1109/IALP.2011.34","DOIUrl":"https://doi.org/10.1109/IALP.2011.34","url":null,"abstract":"In this paper, we propose an automatic language identification approach for code switching sentences by using the morphological structures and sequence of the syllable. The approach was tested on Malay-English code switching sentences. The proposed language identification approach achieves 90.75% in terms of accuracy on the vocabularies. Our approach was further improved by combining the knowledge from other levels in the sentence: word and alphabet. The additional information further improves the accuracy of our language identification method to 96.36%.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131480606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear Regression for Prosody Prediction via Convex Optimization","authors":"Ling Cen, M. Dong, P. Chan","doi":"10.1109/IALP.2011.75","DOIUrl":"https://doi.org/10.1109/IALP.2011.75","url":null,"abstract":"In this paper, an L1-regularized linear regression based method is proposed to model the relationship between the linguistic features and prosodic parameters in Text-to-Speech (TTS) synthesis. By formulating prosodic prediction as a convex problem, it can be solved using very efficient numerical methods. The performance can be similar to that of the Classification and Regression Tree (CART), a widely used approach for prosodic prediction. However, the computational load can be as low as 76% of that required by CART.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133347300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WordNet Editor to Refine Indonesian Language Lexical Database","authors":"Gunawan, J. Wijoyo, I. K. E. Purnama, M. Hariadi","doi":"10.1109/IALP.2011.59","DOIUrl":"https://doi.org/10.1109/IALP.2011.59","url":null,"abstract":"This paper describes an approach for editing the Indonesian Language Lexical Database, especially the noun category and its relations. The purpose of this editor is to refine the Indonesian Lexical Database that was developed in our previous research. The editor's visualization uses a graph library with some modifications and additions. Furthermore, this editor will be web-based so that everyone can participate in improving the Indonesian Language Lexical Database. An administrator role accepts or rejects any changes suggested by members. We believe that this editing approach can also be used to improve WordNets developed for other languages.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131350631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Context Imperative Sentences of Modern Chinese","authors":"Hao Zhao, Kaihong Yang","doi":"10.1109/IALP.2011.48","DOIUrl":"https://doi.org/10.1109/IALP.2011.48","url":null,"abstract":"The present study of the imperative-expression category in modern Chinese mainly focuses on form the mood imperative sentences in general meaning. Besides the mood imperative sentences, the imperative expression also includes those sentences that have imperative functions in context. These sentences can be divided into omissive-imperative sentences and zero-imperative sentences according to the explicitness of the imperative command. This paper mainly examines the types and the information-transmitting characteristics of these two kinds of imperative sentences.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116369124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Duc-Trong Le, Mai-Vu Tran, Tri-Thanh Nguyen, Quang-Thuy Ha
{"title":"Co-reference Resolution in Vietnamese Documents Based on Support Vector Machines","authors":"Duc-Trong Le, Mai-Vu Tran, Tri-Thanh Nguyen, Quang-Thuy Ha","doi":"10.1109/IALP.2011.63","DOIUrl":"https://doi.org/10.1109/IALP.2011.63","url":null,"abstract":"The co-reference resolution task still poses many challenges due to the complexity of the Vietnamese language and the lack of standard Vietnamese linguistic resources. Based on the mention-pair model of Rahman and Ng (2009) and the characteristics of Vietnamese, this paper proposes a model using support vector machines (SVM) to resolve co-reference in Vietnamese documents. The corpus used in experiments to evaluate the proposed model was constructed from 200 articles in cultural and social categories from the vnexpress.net newspaper website. In initial experiments, the proposed model achieved 76.51% accuracy, compared with 73.79% for the baseline model with similar features.","PeriodicalId":297167,"journal":{"name":"2011 International Conference on Asian Language Processing","volume":"64 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120982940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}