Effects of Comparable Corpora on Cross-language Information Retrieval
F. Sadat
International Workshop on Natural Language Processing and Cognitive Science
DOI: 10.5220/0003029200530059 (https://doi.org/10.5220/0003029200530059)
Publication date: 2016-11-29
Citation count: 0
Abstract
This paper presents an approach to learning bilingual terminology from scarce resources in order to translate and expand terms from a source language into a target language and, potentially, to retrieve documents across languages. A bilingual lexicon extracted from comparable corpora provides a valuable resource for enriching existing bilingual dictionaries and thesauri. For Cross-Language Information Retrieval, a linear combination of the bilingual terminology extracted from comparable corpora, readily available bilingual dictionaries, and transliteration is proposed. An application to the Japanese-English language pair shows that the proposed combination yields better translations and that effective information retrieval can be achieved across languages.
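
The linear combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weights, the scoring functions, or any normalisation, so everything here is an assumption made for illustration: the function name `combine_translations`, the source names, the example weights, and the candidate scores are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def combine_translations(sources, weights):
    """Linearly combine per-source translation scores for one source term.

    sources: dict mapping a source name to {candidate translation: score in [0, 1]}
    weights: dict mapping a source name to a non-negative weight
    Returns (candidate, combined score) pairs sorted by score, highest first.
    """
    total = sum(weights[name] for name in sources)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = defaultdict(float)
    for name, candidates in sources.items():
        w = weights[name] / total  # normalise so the weights sum to 1
        for translation, score in candidates.items():
            combined[translation] += w * score

    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical candidate scores for a single query term (not from the paper).
sources = {
    "comparable_corpora": {"information": 0.6, "news": 0.4},
    "dictionary": {"information": 1.0},
    "transliteration": {},  # no transliteration candidate for this term
}
# Hypothetical weights; in practice they would be tuned on held-out queries.
weights = {"comparable_corpora": 0.5, "dictionary": 0.25, "transliteration": 0.25}

ranked = combine_translations(sources, weights)
```

With these illustrative numbers, "information" ranks above "news" because both the comparable-corpora model and the dictionary support it. A candidate backed by several sources accumulates score, which is the intuition behind combining the three resources rather than relying on one.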