Anindya Sau, Tarik Aziz Amin, Nabagata Barman, A. R. Pal
Word Sense Disambiguation in Bengali Using Sense Induction
2019 International Conference on Applied Machine Learning (ICAML), published 2019-05-01. DOI: 10.1109/ICAML48257.2019.00040
Citation count: 3
Abstract
This paper proposes an algorithm for Word Sense Disambiguation in Bengali using a Sense Induction technique. The work is carried out in two phases. In the first phase, sense clusters are created using Sense Induction; in the second phase, Word Sense Disambiguation is performed using a Semantic Similarity Measure. The data sets are prepared from the corpus developed under the TDIL (Technology Development for Indian Languages) project of the Government of India. The model is tested on 10 commonly used ambiguous Bengali words, each with approximately 200 sentences. The overall accuracy achieved on the Word Sense Disambiguation task is 63.71%. The challenges and pitfalls of this work are explained in detail at the end of the paper.
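The two-phase idea can be illustrated with a minimal Python sketch. The abstract does not specify the clustering procedure or the similarity measure, so everything here is an assumption made for illustration: bag-of-words context vectors, a greedy single-pass clustering for the induction phase, cosine similarity for the disambiguation phase, an arbitrary threshold of 0.2, and a toy English corpus standing in for the TDIL Bengali data. The function names `induce_senses` and `disambiguate` are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def context_vector(sentence: str, target: str) -> Counter:
    """Count the context words of a sentence, excluding the target word."""
    return Counter(tok for tok in sentence.lower().split() if tok != target)


def induce_senses(sentences, target, threshold=0.2):
    """Phase 1 (assumed method): greedy single-pass clustering of contexts.

    Each cluster is stored as the summed context vector of its members.
    """
    clusters = []
    for sentence in sentences:
        vec = context_vector(sentence, target)
        scores = [cosine(vec, c) for c in clusters]
        if scores and max(scores) >= threshold:
            # Similar enough to an existing sense: merge into that cluster.
            clusters[scores.index(max(scores))].update(vec)
        else:
            # Otherwise start a new sense cluster.
            clusters.append(Counter(vec))
    return clusters


def disambiguate(sentence, target, clusters):
    """Phase 2 (assumed method): return the index of the closest cluster."""
    vec = context_vector(sentence, target)
    scores = [cosine(vec, c) for c in clusters]
    return scores.index(max(scores))


# Toy corpus (hypothetical data, function words omitted for simplicity).
corpus = [
    "bank approved loan money",
    "deposited money bank account",
    "river bank muddy water",
    "fish swim river bank water",
]
senses = induce_senses(corpus, "bank")
financial = disambiguate("loan money bank", "bank", senses)
riverside = disambiguate("bank river water", "bank", senses)
```

On this toy corpus the induction phase yields two clusters, and the two test sentences are assigned to different ones. A real system would need stop-word handling, Bengali tokenization, and a tuned threshold, none of which are described in the abstract.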