阿法安奥罗莫语词义消歧的归一化统计算法

Q3 Computer Science

International Journal of Intelligent Systems and Applications in Engineering Pub Date : 2021-12-08 DOI:10.5815/ijisa.2021.06.04

A. Abafogi

{"title":"阿法安奥罗莫语词义消歧的归一化统计算法","authors":"A. Abafogi","doi":"10.5815/ijisa.2021.06.04","DOIUrl":null,"url":null,"abstract":"Language is the main means of communication used by human. In various situations, the same word can mean differently based on the usage of the word in a particular sentence which is challenging for a computer to understand as level of human. Word Sense Disambiguation (WSD), which aims to identify correct sense of a given ambiguity word, is a long-standing problem in natural language processing (NLP). As the major aim of WSD is to accurately understand the sense of a word in particular context, can be used for the correct labeling of words in natural language applications. In this paper, I propose a normalized statistical algorithm that performs the task of WSD for Afaan Oromo language despite morphological analysis The propose algorithm has the power to discriminate ambiguous word’s sense without windows size consideration, without predefined rule and without utilize annotated dataset for training which minimize a challenge of under resource languages. The proposed system tested on 249 sentences with precision, recall, and F-measure. The overall effectiveness of the system is 80.76% in F-measure, which implies that the proposed system is promising on Afaan Oromo that is one of under resource languages spoken in East Africa. The algorithm can be extended for semantic text similarity without modification or with a bit modification. Furthermore, the forwarded direction can improve the performance of the proposed algorithm.","PeriodicalId":14067,"journal":{"name":"International Journal of Intelligent Systems and Applications in Engineering","volume":"77 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Normalized Statistical Algorithm for Afaan Oromo Word Sense Disambiguation\",\"authors\":\"A. Abafogi\",\"doi\":\"10.5815/ijisa.2021.06.04\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Language is the main means of communication used by human. In various situations, the same word can mean differently based on the usage of the word in a particular sentence which is challenging for a computer to understand as level of human. Word Sense Disambiguation (WSD), which aims to identify correct sense of a given ambiguity word, is a long-standing problem in natural language processing (NLP). As the major aim of WSD is to accurately understand the sense of a word in particular context, can be used for the correct labeling of words in natural language applications. In this paper, I propose a normalized statistical algorithm that performs the task of WSD for Afaan Oromo language despite morphological analysis The propose algorithm has the power to discriminate ambiguous word’s sense without windows size consideration, without predefined rule and without utilize annotated dataset for training which minimize a challenge of under resource languages. The proposed system tested on 249 sentences with precision, recall, and F-measure. The overall effectiveness of the system is 80.76% in F-measure, which implies that the proposed system is promising on Afaan Oromo that is one of under resource languages spoken in East Africa. The algorithm can be extended for semantic text similarity without modification or with a bit modification. Furthermore, the forwarded direction can improve the performance of the proposed algorithm.\",\"PeriodicalId\":14067,\"journal\":{\"name\":\"International Journal of Intelligent Systems and Applications in Engineering\",\"volume\":\"77 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Intelligent Systems and Applications in Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5815/ijisa.2021.06.04\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Intelligent Systems and Applications in Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5815/ijisa.2021.06.04","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}

引用次数: 0

摘要

语言是人类交流的主要手段。在不同的情况下，同一个单词可以根据它在特定句子中的用法而有不同的意思，这对于计算机来说是具有挑战性的。词义消歧是自然语言处理(NLP)中一个长期存在的问题，其目的是对给定的歧义词进行正确的词义识别。由于WSD的主要目的是准确理解单词在特定上下文中的意义，因此可以用于自然语言应用中正确标记单词。在本文中，我提出了一种标准化的统计算法来执行Afaan Oromo语言的WSD任务，该算法具有在不考虑窗口大小、不使用预定义规则和不使用带注释的数据集进行训练的情况下区分歧义词的能力，从而最大限度地减少了资源不足语言的挑战。该系统对249个句子进行了精度、召回率和F-measure测试。该系统的F-measure总体有效性为80.76%，这意味着该系统在Afaan Oromo(东非资源不足的语言之一)上是有希望的。该算法可以在不修改或稍作修改的情况下进行语义文本相似度的扩展。此外，转发方向可以提高算法的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Normalized Statistical Algorithm for Afaan Oromo Word Sense Disambiguation

Language is the main means of communication used by human. In various situations, the same word can mean differently based on the usage of the word in a particular sentence which is challenging for a computer to understand as level of human. Word Sense Disambiguation (WSD), which aims to identify correct sense of a given ambiguity word, is a long-standing problem in natural language processing (NLP). As the major aim of WSD is to accurately understand the sense of a word in particular context, can be used for the correct labeling of words in natural language applications. In this paper, I propose a normalized statistical algorithm that performs the task of WSD for Afaan Oromo language despite morphological analysis The propose algorithm has the power to discriminate ambiguous word’s sense without windows size consideration, without predefined rule and without utilize annotated dataset for training which minimize a challenge of under resource languages. The proposed system tested on 249 sentences with precision, recall, and F-measure. The overall effectiveness of the system is 80.76% in F-measure, which implies that the proposed system is promising on Afaan Oromo that is one of under resource languages spoken in East Africa. The algorithm can be extended for semantic text similarity without modification or with a bit modification. Furthermore, the forwarded direction can improve the performance of the proposed algorithm.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Journal of Intelligent Systems and Applications in Engineering Computer Science-Computer Graphics and Computer-Aided Design

CiteScore

1.30

自引率

0.00%

发文量