{"title":"Adapting Google Translate for English-Persian cross-lingual information retrieval in medical domain","authors":"Amin Rahmani","doi":"10.1109/AISP.2017.8324104","PeriodicalId":386952,"journal":{"name":"2017 Artificial Intelligence and Signal Processing Conference (AISP)","volume":"9 1","pages":"0"},"publicationDate":"2017-10-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"3","platform":"Semanticscholar","PeriodicalName":"2017 Artificial Intelligence and Signal Processing Conference (AISP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AISP.2017.8324104"}
Adapting Google Translate for English-Persian cross-lingual information retrieval in medical domain
Cross-lingual information retrieval (CLIR) systems enable users to search for and find information in sources written in languages other than their native language. In general, these systems help users overcome the language barrier. Although several techniques are used to develop such systems, the query translation method has attracted much attention because of its performance. This paper proposes a new approach to English-Persian CLIR in which Google Translate's API is adapted to translate the queries for the CLIR system. Fifty queries were selected from the TREC dataset to evaluate the system. Both the English queries and their Persian equivalents were searched in RICeST's English and Persian E-articles databases. For the black-box evaluation, the 11-point interpolated average precision metric was used to obtain an average precision (AP) score for each query, after which mean average precision (MAP) scores were calculated for the English and Persian queries. The MAP scores for the monolingual and cross-lingual systems were 0.421 and 0.382, respectively. For the glass-box evaluation, the machine translation system's performance was measured with the BLEU automatic metric. According to the results of this study, 90% similarity in retrieval performance was observed between the CLIR and monolingual systems. The new approach was ideally suited to the English-Persian CLIR task.
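The black-box scores described above can be illustrated with a minimal sketch. It assumes binary relevance judgments and one ranked list of document IDs per query. The function names and the toy data are hypothetical and are not taken from the paper. The last lines compute the ratio of the two reported MAP scores, which is about 0.907. That is consistent with the 90% figure, although the abstract does not say the figure was derived this way.

```python
from typing import Iterable, Sequence, Set, Tuple

# The 11 standard recall levels: 0.0, 0.1, ..., 1.0
RECALL_LEVELS = [i / 10 for i in range(11)]


def eleven_point_ap(ranked: Sequence[str], relevant: Set[str]) -> float:
    """11-point interpolated average precision for a single query."""
    if not relevant:
        return 0.0
    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    # Interpolated precision at a recall level is the highest precision
    # observed at any recall greater than or equal to that level.
    total = 0.0
    for level in RECALL_LEVELS:
        total += max((p for r, p in points if r >= level - 1e-9), default=0.0)
    return total / len(RECALL_LEVELS)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query AP scores over all queries."""
    scores = [eleven_point_ap(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Toy data (hypothetical): two queries with their ranked results and judgments.
toy_runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
toy_map = mean_average_precision(toy_runs)

# Ratio of the MAP scores reported in the abstract (cross-lingual / monolingual).
map_ratio = 0.382 / 0.421
```

In the setting the abstract describes, `mean_average_precision` would be called once on the results for the English queries and once on the results for the Persian queries, and the two MAP values would then be compared.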