Word translation for Indo-Aryan languages using different retrieval techniques

Kiranjeet Kaur, S. Chauhan
Journal of Autonomous Intelligence, published 2024-03-04
DOI: 10.32629/jai.v7i4.1455 (https://doi.org/10.32629/jai.v7i4.1455)

Abstract

Word embeddings have reshaped Natural Language Processing, enabling language models to understand and generate human-like text. This article examines the principles, methods, and applications of word embedding. Word translation, or the incorporation of bilingual dictionaries, is an important factor in many multilingual language processing tasks. We use bilingual dictionaries or parallel data to translate from one language to another. This work addresses that problem and also aims to generate the best cross-lingual word embeddings for each language pair. To this end, we use a document-aligned corpus, a sentence-aligned corpus, or a bilingual dictionary. For the most frequent words, we assume that the intra-lingual similarity distributions of the source and target corpora are comparable. Additionally, these embeddings are isometric. Such cross-lingual word embeddings are used for cross-lingual transfer learning and unsupervised neural machine translation. This research aims to improve the accuracy and efficiency of word translation between language pairs by employing different retrieval techniques. The study analyzes the effectiveness of these techniques on different language pairs, including English-Hindi, English-Punjabi, English-Gujarati, English-Bengali, and English-Marathi. The research is expected to contribute to the field of language translation by introducing new methods and applications.
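
The abstract does not name the retrieval techniques it compares, so the following is only a minimal sketch of one common baseline, nearest-neighbour retrieval by cosine similarity, and not the authors' method. It assumes the source and target embeddings have already been mapped into one shared space. The vectors, the romanized Hindi labels, and the helper names `cosine`, `translate_nn`, and `precision_at_1` are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translate_nn(word, src_emb, tgt_emb, k=1):
    """Return the k target words closest to the source word's vector."""
    query = src_emb[word]
    ranked = sorted(tgt_emb, key=lambda t: cosine(query, tgt_emb[t]), reverse=True)
    return ranked[:k]


def precision_at_1(gold, src_emb, tgt_emb):
    """Fraction of gold dictionary pairs whose top-1 retrieval is correct."""
    hits = sum(translate_nn(s, src_emb, tgt_emb)[0] == t for s, t in gold.items())
    return hits / len(gold)


# Toy, hand-made vectors (assumption): both sides already live in one space.
src_emb = {
    "water": [0.9, 0.1, 0.0],
    "book": [0.1, 0.9, 0.1],
    "sun": [0.0, 0.2, 0.9],
}
tgt_emb = {
    "pani": [0.8, 0.2, 0.1],   # romanized Hindi, illustrative only
    "kitab": [0.2, 0.8, 0.0],
    "suraj": [0.1, 0.1, 0.9],
}

# Hypothetical gold bilingual dictionary used only to score the toy example.
gold = {"water": "pani", "book": "kitab", "sun": "suraj"}

if __name__ == "__main__":
    for word in src_emb:
        print(word, "->", translate_nn(word, src_emb, tgt_emb)[0])
    print("precision@1:", precision_at_1(gold, src_emb, tgt_emb))
```

In practice the embeddings would be trained on real corpora and aligned with a bilingual dictionary before retrieval. Other retrieval criteria could be swapped in by replacing the `cosine` scoring function inside `translate_nn`.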