Design and Development of Enhanced Morphological Analyzer for Ge’ez Verbs Using Memory Based Learning Algorithms

Gebremeskel Hagos Gebremedhin, F. Wang
Journal: The International Journal of Science & Technoledge
Publication date: 2020-07-31
Publication type: Journal Article
DOI: 10.24940/theijst/2020/v8/i7/st2007-001
URL: https://doi.org/10.24940/theijst/2020/v8/i7/st2007-001

Abstract

This paper presents the design of a morphological analyzer for Ge’ez verbs. Ge’ez is the classical language of Ethiopia and is still used as the liturgical language of the Ethiopian Orthodox Tewahedo Church. Much ancient literature, both religious texts and secular writings, was written in Ge’ez, and it records the ancient philosophy, tradition, history and knowledge of Ethiopia. A morphological analyzer is one of the most important basic tools in the automatic processing of any human language: it analyzes the naturally occurring word forms in a sentence and identifies the root word and its features. In this paper, memory-based learning (MBL) is used to analyze the morphology of Ge’ez verbs automatically, with machine learning used for both training and analysis. TiMBL’s IB2 and TRIBL2 algorithms were used for the implementation. The performance of the system was evaluated using 10-fold cross-validation with both the default and the optimized parameter settings. The overall accuracy with optimized parameters was 94.24% for IB2 and 93.31% for TRIBL2. The overall precision, recall and F-score with optimized parameters using IB2 were 55.6%, 56.3% and 59.95%, respectively; for TRIBL2 they were 58.8%, 60.3% and 59.54%, respectively. A learning curve was also drawn; it showed that accuracy on unseen data increases as the size of the training dataset increases. On overall accuracy, IB2 therefore performs better than TRIBL2 for Ge’ez verb morphology.
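The evaluation pipeline described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not TiMBL's implementation: it assumes an IB2-style learner that stores a training instance only when the current memory misclassifies it, a plain overlap distance over symbolic features, and a simple interleaved 10-fold split. The function names and the toy instances are hypothetical, and the F-score is assumed to be the balanced F1 (the harmonic mean of precision and recall).

```python
from collections import Counter


def overlap_distance(a, b):
    """Number of feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(memory, features, k=1):
    """Majority label among the k stored instances nearest to `features`."""
    nearest = sorted(memory, key=lambda inst: overlap_distance(inst[0], features))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def train_ib2(instances, k=1):
    """IB2-style editing: keep an instance only if the memory gets it wrong."""
    memory = [instances[0]]
    for features, label in instances[1:]:
        if classify(memory, features, k) != label:
            memory.append((features, label))
    return memory


def cross_validate(instances, folds=10, k=1):
    """Mean accuracy over `folds` interleaved held-out partitions."""
    scores = []
    for i in range(folds):
        test = instances[i::folds]
        train = [inst for j, inst in enumerate(instances) if j % folds != i]
        memory = train_ib2(train, k)
        correct = sum(classify(memory, f, k) == y for f, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds


def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: symbolic feature tuples with made-up class labels.
toy = [((c, str(i % 3)), "X" if c == "a" else "Y") for i in range(20) for c in "ab"]

if __name__ == "__main__":
    print("10-fold accuracy on toy data:", cross_validate(toy))
    print("F1 from TRIBL2 precision/recall:", round(f_score(58.8, 60.3), 2))
    print("F1 from IB2 precision/recall:", round(f_score(55.6, 56.3), 2))
```

Under the F1 assumption, the reported TRIBL2 precision and recall reproduce the reported 59.54%. The IB2 precision and recall give 55.95%, whereas the abstract reports 59.95%, so the IB2 F-score may have been computed with a different formula or mistyped in the source.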