{"title":"Analysis and Evaluation of Terminology Translation Consistency in Scientific and Technical Literature","authors":"Baosheng Yin, Xiaodong Yue, Dongfeng Cai, Guiping Zhang","doi":"10.1109/IALP.2013.25","DOIUrl":"https://doi.org/10.1109/IALP.2013.25","url":null,"abstract":"In large-scale translation of scientific and technical literature involving many translators, inconsistency in the translation of the same terminology is inevitable. First, this paper carries out a comprehensive analysis of terminology translation inconsistency and finds that most cases are translations with the same meaning but different wording, which harms the readability of the whole article. Then, we put forward a semantic similarity-based calculation method to identify this category of terminology inconsistency, select the translation used most frequently on the web according to Internet search engines, and unify the terminology accordingly. Finally, we evaluate the improved (post-edited) literature with two indexes: precision and consistency of terminology translation. In the experimental analysis, a 100000-word patent document (English-Chinese human translation) is selected. The consistency index of the original translation is 0.494, and the consistency index of the processed translation is 0.763. The experimental results indicate that the method effectively improves terminology translation consistency on the premise of correctly replacing terminology, and thus significantly improves the readability of the translation.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133258705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Purely Monotonic Approach to Machine Translation for Similar Languages","authors":"Ye Kyaw Thu, A. Finch, E. Sumita, Y. Sagisaka","doi":"10.1109/IALP.2013.31","DOIUrl":"https://doi.org/10.1109/IALP.2013.31","url":null,"abstract":"This paper investigates the effect of taking a strictly monotonic approach to machine translation for a restricted set of suitable language pairs. We studied the effect of decoding monotonically for a set of language pairs that have similar word order characteristics and found that for some language pairs - namely language pairs where both languages are in SOV order - there was almost no difference in machine translation quality. The results of this experiment motivated the extension of the monotonic approach into the alignment stage of the training. We used a Bayesian non-parametric aligner that has been shown to outperform GIZA++ in combination with the grow-diag-final-and heuristic on transliteration data. Our results show that the monotonic aligner was able to match the performance of the GIZA++ baseline, and gains in translation performance were obtained by integrating both aligners into the systems.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115625063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NLP-Oriented Study on the Imperative Sentence with Interrogative Mood","authors":"Pu Li, Hao Zhao","doi":"10.1109/IALP.2013.18","DOIUrl":"https://doi.org/10.1109/IALP.2013.18","url":null,"abstract":"The imperative sentence with interrogative mood (ISIM) is a transitional category between the imperative sentence and the interrogative sentence. It has both the form of an interrogative sentence and the information-conveying function of an imperative sentence. As a result, the classification and meaning of ISIM are controversial in Natural Language Processing research and applications. In this paper, the authors make three claims: first, an ISIM has two parts, an imperative center that conveys the imperative information and an interrogative structure that makes the whole sentence more euphemistic; second, there are four formal types of ISIM: attached imperative sentence, imperative sentence of positive and negative, imperative sentence of right and wrong, and rhetorical imperative sentence; third, the semantics and type of the interrogative structure affect the semantics of the whole sentence.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116712760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tibetan Word Segmentation Based on Word-Position Tagging","authors":"Caijun Kang, Di Jiang, Congjun Long","doi":"10.1109/IALP.2013.74","DOIUrl":"https://doi.org/10.1109/IALP.2013.74","url":null,"abstract":"The main advantage of Tibetan word segmentation based on word-position tagging is that it reduces segmentation errors for unknown words. In this article, the authors extend the usual 4-tag set to a 6-tag set to fit the features of Tibetan characters, use CRF as the tagging model to train and test on corpus data, and then build post-processing modules to revise the results. The experimental results show that this method achieves good performance and deserves further study, including expanding the corpus and optimizing the tag set and feature templates.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123316536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Quality Controlling for Cross-Lingual Sentiment Classification","authors":"Shoushan Li, Yunxia Xue, Zhongqing Wang, Sophia Yat-Mei Lee, Chu-Ren Huang","doi":"10.1109/IALP.2013.43","DOIUrl":"https://doi.org/10.1109/IALP.2013.43","url":null,"abstract":"Cross-lingual sentiment classification aims to perform sentiment classification in one language (the target language) with the help of resources from another language (the source language). Previous studies tend to use all available data in the source language, although using all data is observed to perform no better, or even worse, than using a portion of good data. In this paper, we propose a novel task called data quality controlling in the source language to select high-quality samples from the source language. To tackle this task, we propose two kinds of data quality measurements: intra- and extra-quality measurements, which are implemented with certainty and similarity measurements respectively. The empirical studies demonstrate the effectiveness of the proposed approach to data quality controlling in the source language.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126426995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the Difficulty of Concepts on Domain Knowledge Using Latent Semantic Analysis","authors":"Tao-Hsing Chang, Y. Sung, Yao-Tung Lee","doi":"10.1109/IALP.2013.58","DOIUrl":"https://doi.org/10.1109/IALP.2013.58","url":null,"abstract":"In educational research and its applications, evaluating concept difficulty is necessary but difficult to carry out. Two major approaches have been employed in previous research to evaluate the difficulty of concepts. Neither approach takes into account whether the concepts have been acquired by learners and readers. This paper focuses on constructing a basic concept list of domain knowledge with latent semantic analysis (LSA) and uses the age of acquisition of a concept to represent its difficulty. Natural science texts from elementary schools in Taiwan are used as experimental materials to verify the validity of the proposed method for evaluating concept difficulty.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116393406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discourse Topic in Anaphora Resolution and Discourse Construction","authors":"Donghong Liu","doi":"10.1109/IALP.2013.69","DOIUrl":"https://doi.org/10.1109/IALP.2013.69","url":null,"abstract":"Performing the function of encapsulating the whole discourse, discourse topic is usually considered as the center of the discourse. However, the representation form of discourse topic has not been agreed upon due to the elusive nature of the notion per se. Some people view discourse topic as an entity; others regard it as a question; still others as a proposition or even as an unnecessary form. This paper points out that discourse entity cannot be used in event anaphora if it is considered as discourse topic; that if a discourse topic takes the form of question, anaphora resolution may not be realized because of the uncertainty in the question and in the components of the question; and that if a discourse topic takes null form, the global coherence and the relevance of a discourse might be undermined. The paper also proves that comparatively propositional discourse topics conform to human beings' cognitive psychology, contribute to event anaphora resolution and facilitate discourse construction.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124055708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Acoustic Research of Stops CV Structure Coarticulation in Amdo Tibetan Xiahe Dialect","authors":"Shiliang Lv, Yasheng Jin, Ning Ma, Xuechen Yin","doi":"10.1109/IALP.2013.51","DOIUrl":"https://doi.org/10.1109/IALP.2013.51","url":null,"abstract":"This paper studies the stop CV structure in the Xiahe dialect of Amdo Tibetan. Using the research methods of speech acoustics and the extraction of speech formant parameters, it investigates intra-syllable coarticulation in the CV structure. The main results are as follows. In the Xiahe dialect, the CV structure shows coarticulation effects between consonants and vowels. The consonant has a great impact on the vowel, related mainly to the consonant's place of articulation. Consonant initials are also influenced by the vowel. From the locus slope of the consonants, we found that consonants with a back place of articulation were affected most, and that the effect on unaspirated sounds is greater than on aspirated sounds.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"220 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124477289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Semantic Orientation of Multiword Expression from Chinese Microblogging with Discriminative Latent Model","authors":"Xiao Sun, Chengcheng Li, Chenyi Tang, F. Ren","doi":"10.1109/IALP.2013.41","DOIUrl":"https://doi.org/10.1109/IALP.2013.41","url":null,"abstract":"Extracting the semantic orientation of Multiword Expressions, especially newly generated Multiword Expressions from the internet, is an important task for sentiment analysis of web texts and other real-world text, as some Multiword Expressions can express more integrative sentiments than word units. This paper proposes a method containing a novel latent discriminative algorithm, which attempts to attack this problem by integrating a discriminative model and a latent variable model. Although Chinese Multiword Expressions consist of multiple words, the semantic orientation of a Multiword Expression is not just a simple integration of the orientations of its component words, as some words can invert the affective orientation, so that the Multiword Expression can have a totally opposite semantic orientation. In order to capture this property of Multiword Expressions, a hidden semi-CRF is adopted, which includes a latent variable layer and can be used to address dual-sequence labeling tasks synchronously. The method is tested experimentally on a manually labeled set of positive and negative Multiword Expressions from microblogs and other internet resources, and the experiments have shown very promising results, which are comparable to the best value ever reported.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131948150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Parallel Corpus from Sina Microblog","authors":"Haitao Xing, Muyun Yang, Haoliang Qi, Sheng Li, T. Zhao","doi":"10.1109/IALP.2013.29","DOIUrl":"https://doi.org/10.1109/IALP.2013.29","url":null,"abstract":"Finding parallel corpora, as a specific type of information, on microblogging sites with millions of users, such as Sina Microblog, is a challenging task. This paper investigates the feasibility of mining such data from the username, the hash tag, and the user relations using three different methods. The initial experiment is encouraging given the current restriction of limited access to microblog content.","PeriodicalId":413833,"journal":{"name":"2013 International Conference on Asian Language Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131985929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}