Word Sense Disambiguation in Bengali Using Sense Induction

2019 International Conference on Applied Machine Learning (ICAML) Pub Date: 2019-05-01 DOI: 10.1109/ICAML48257.2019.00040

Anindya Sau, Tarik Aziz Amin, Nabagata Barman, A. R. Pal

Citations: 3

Abstract

This paper proposes an algorithm for Word Sense Disambiguation in Bengali using sense induction. The work is carried out in two phases. In the first phase, sense clusters are created using a sense induction technique; in the second phase, word senses are disambiguated using a semantic similarity measure. The data sets are prepared from the corpus developed under the TDIL (Technology Development for Indian Languages) project of the Government of India. The model is tested on 10 commonly used ambiguous Bengali words, each with approximately 200 sentences. The overall accuracy on the Word Sense Disambiguation task is 63.71%. The challenges and pitfalls of this work are explained in detail at the end of the paper.
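
The two phases described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering method or the similarity measure, so the bag-of-words context vectors, the greedy single-pass clustering, the cosine similarity, the `threshold` parameter, and all function names below are assumptions made for illustration, not the authors' method. English toy sentences stand in for the Bengali TDIL data.

```python
import math
from collections import Counter


def bag_of_words(sentence, target):
    """Context vector: counts of the words that co-occur with the target word."""
    return Counter(w for w in sentence.lower().split() if w != target)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def induce_senses(sentences, target, threshold=0.2):
    """Phase 1 (assumed): greedy single-pass clustering of context vectors.

    Each cluster is a Counter holding the summed contexts of its members.
    A sentence joins the most similar cluster if the similarity reaches
    `threshold`; otherwise it starts a new cluster (a new induced sense).
    """
    clusters = []
    for sentence in sentences:
        vec = bag_of_words(sentence, target)
        best = max(clusters, key=lambda c: cosine(vec, c), default=None)
        if best is not None and cosine(vec, best) >= threshold:
            best.update(vec)
        else:
            clusters.append(Counter(vec))
    return clusters


def disambiguate(sentence, target, clusters):
    """Phase 2 (assumed): return the index of the most similar sense cluster."""
    vec = bag_of_words(sentence, target)
    return max(range(len(clusters)), key=lambda i: cosine(vec, clusters[i]))


# Hypothetical toy data for the ambiguous word "bank".
training = [
    "bank approved loan money",
    "bank loan interest money",
    "river bank muddy water",
    "fishing river bank water",
]
senses = induce_senses(training, "bank")
financial = disambiguate("bank charged interest", "bank", senses)
riverside = disambiguate("muddy water near bank", "bank", senses)
```

In this sketch the induced senses are unlabeled cluster indices; evaluating accuracy as reported in the abstract would additionally require mapping each cluster to a gold sense, a step the abstract does not describe.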

