Word Sense Disambiguation in Bengali Using Sense Induction

Anindya Sau, Tarik Aziz Amin, Nabagata Barman, A. R. Pal
2019 International Conference on Applied Machine Learning (ICAML), published 2019-05-01
DOI: 10.1109/ICAML48257.2019.00040 (https://doi.org/10.1109/ICAML48257.2019.00040)
Citations: 3

Abstract

This paper proposes an algorithm for Word Sense Disambiguation in Bengali based on Sense Induction. The work proceeds in two phases: in the first, sense clusters are created using a Sense Induction technique; in the second, Word Sense Disambiguation is performed using a Semantic Similarity Measure. The data sets are prepared from the corpus developed under the TDIL (Technology Development for Indian Languages) project of the Government of India. The model is tested on 10 commonly used ambiguous Bengali words, each with approximately 200 sentences, and achieves an overall accuracy of 63.71% on the Word Sense Disambiguation task. The challenges and pitfalls of this work are explained in detail at the end of the paper.
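The two-phase idea can be illustrated with a minimal Python sketch. The abstract does not say which clustering method or similarity measure the authors use, so everything here is an assumption: greedy threshold clustering stands in for Sense Induction, bag-of-words cosine similarity stands in for the Semantic Similarity Measure, and the function names, the threshold, and the English toy contexts (for an ambiguous word such as "bank", with the target word itself removed) are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def induce_senses(contexts, threshold=0.2):
    """Phase 1 (assumed method): greedily group contexts into sense clusters.

    Each cluster is a Counter holding the summed word counts of its members.
    A context joins the most similar cluster if the similarity reaches the
    threshold; otherwise it starts a new cluster.
    """
    clusters = []
    for tokens in contexts:
        vec = Counter(tokens)
        scores = [cosine(vec, c) for c in clusters]
        best = max(range(len(clusters)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            clusters[best].update(vec)
        else:
            clusters.append(vec)
    return clusters


def disambiguate(tokens, clusters):
    """Phase 2 (assumed measure): pick the cluster most similar to the context."""
    vec = Counter(tokens)
    return max(range(len(clusters)), key=lambda i: cosine(vec, clusters[i]))


# Hypothetical toy contexts for one ambiguous word (target word removed).
contexts = [
    ["river", "water", "fishing"],
    ["loan", "money", "account"],
    ["water", "flooded", "river"],
    ["money", "deposit", "account"],
]

clusters = induce_senses(contexts)
river_sense = disambiguate(["boat", "river", "water"], clusters)
money_sense = disambiguate(["account", "money", "interest"], clusters)
print(len(clusters), river_sense, money_sense)
```

On this toy data the four contexts fall into two clusters, and each new sentence is assigned to the cluster that shares its vocabulary. A real system for Bengali would need tokenization, stop-word handling, and a richer similarity measure than raw word overlap.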