Afan Oromo Sense Clustering in Hierarchical and Partitional Techniques

Workineh Tesema Gudisa
Journal of Information Technology & Software Engineering, vol. 6, no. 1, pp. 1-4. Published 2016-11-10.
DOI: 10.4172/2165-7866.1000191 (https://doi.org/10.4172/2165-7866.1000191)
Citations: 4

Abstract

This paper presents sense clustering of multi-sense words in Afan Oromo. The main idea is to cluster contexts, which provides a useful way to discover semantically related senses. Similar contexts of a given sense of a target word are clustered using three hierarchical and two partitional clustering algorithms. All contexts of related senses are included, so clustering is performed over all the contexts in the corpus. The underlying hypothesis is that clustering captures the unity reflected among the contexts and that each cluster reveals possible relationships among them. The experiments show that, of the five algorithms, the two partitional ones (EM and K-means) yield significantly higher accuracy for Afan Oromo than the three hierarchical ones (single-link, complete-link, and average-link clustering). Each cluster represents a unique sense, and the target words have between two and five senses. The average accuracy on the test set was 85.5%, which is encouraging for an unsupervised machine learning approach. With this approach, finding the right number of clusters is equivalent to finding the number of senses. The result is encouraging given the approach's low resource requirements.
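
To make the idea of clustering contexts concrete, the following is a minimal sketch of agglomerative clustering with the three linkage criteria named in the abstract. It is not the paper's implementation: the bag-of-words features, the cosine distance, the function names, and the English toy contexts for the word "bank" are all assumptions chosen for illustration, and the partitional algorithms (K-means and EM) are not shown.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy contexts for the ambiguous English word "bank";
# these are NOT taken from the paper's Afan Oromo corpus.
CONTEXTS = [
    "river bank water fish",
    "water river bank boat",
    "bank money loan account",
    "loan bank money interest",
]

def bag_of_words(context):
    """Represent a context as word counts (assumed feature scheme)."""
    return Counter(context.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

# How the pairwise distances between two clusters are combined.
LINKAGES = {
    "single": min,                             # closest pair
    "complete": max,                           # farthest pair
    "average": lambda ds: sum(ds) / len(ds),   # mean over all pairs
}

def agglomerative(contexts, n_clusters, linkage="average"):
    """Merge the two closest clusters until n_clusters remain."""
    vectors = [bag_of_words(c) for c in contexts]
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            dists = [cosine_distance(vectors[i], vectors[j])
                     for i in clusters[x] for j in clusters[y]]
            d = combine(dists)
            if best is None or d < best[0]:
                best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge y into x
        del clusters[y]                          # y > x, so x stays valid
    return [sorted(c) for c in clusters]

if __name__ == "__main__":
    for name in LINKAGES:
        print(name, agglomerative(CONTEXTS, 2, name))
```

On this toy input, all three linkages place contexts 0 and 1 (the river sense) in one cluster and contexts 2 and 3 (the financial sense) in the other, so the two clusters correspond to the two senses. On real corpora the linkages generally disagree, which is the kind of difference the reported comparison measures.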