Automatic extraction of bilingual terms from a Chinese-Japanese parallel corpus
Xiaorong Fan, N. Shimizu, Hiroshi Nakagawa
Proceedings of the 3rd International Universal Communication Symposium, 2009-12-03
DOI: 10.1145/1667780.1667789 (https://doi.org/10.1145/1667780.1667789)
Citations: 15
Abstract
This paper proposes a new approach for the automatic extraction of bilingual terms from a domain-specific parallel corpus. We combine an existing monolingual term extractor with a word alignment tool to extract bilingual terms. Our method differs from past studies in two ways: we use only a word alignment tool to extract multi-word terms, and we use a single monolingual term extractor for both languages to reduce extraction imbalance. In an experiment on a Chinese-Japanese parallel corpus, we obtained good precision and an improved BLEU score.
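
The abstract does not say how terms from the two languages are paired, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a monolingual term extractor has already marked candidate term spans in each sentence and that a word alignment tool has produced token-level links. The function `pair_terms`, the `Span` type, and the consistency check (borrowed from phrase extraction in statistical machine translation) are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # half-open token span [start, end); hypothetical type


def pair_terms(
    src_tokens: List[str],
    tgt_tokens: List[str],
    alignment: Set[Tuple[int, int]],  # (source index, target index) links
    src_terms: List[Span],            # spans from a monolingual extractor
    tgt_terms: List[Span],
) -> List[Tuple[str, str]]:
    """Pair source and target term spans whose tokens align only to each other."""
    pairs = []
    for s_start, s_end in src_terms:
        # Target positions linked to any token inside the source term.
        linked = {t for s, t in alignment if s_start <= s < s_end}
        if not linked:
            continue
        for t_start, t_end in tgt_terms:
            # The links must cover the target span exactly.
            if linked != set(range(t_start, t_end)):
                continue
            # No source token outside the term may link into the target span.
            back = {s for s, t in alignment if t_start <= t < t_end}
            if all(s_start <= s < s_end for s in back):
                # Chinese and Japanese are written without spaces.
                pairs.append((
                    "".join(src_tokens[s_start:s_end]),
                    "".join(tgt_tokens[t_start:t_end]),
                ))
    return pairs


# Toy Chinese-Japanese sentence pair (invented for illustration).
zh = ["机器", "翻译", "系统", "很", "快"]
ja = ["機械", "翻訳", "システム", "は", "速い"]
links = {(0, 0), (1, 1), (2, 2), (4, 4)}
print(pair_terms(zh, ja, links, [(0, 3)], [(0, 3)]))
```

Requiring the links to be consistent in both directions favors precision over recall. A real system would probably also aggregate counts over the whole corpus and filter low-frequency pairs, which this sketch does not do.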