Analysis of Machine-Based Learning Algorithm Used in Named Entity Recognition

Informing Science. Pub Date: 2023-01-01. DOI: 10.28945/5073
F. M. Kamau, Kennedy Ogada, Cheruiyot W. Kipruto

Abstract

Aim/Purpose: The volume of published information has grown dramatically. Managing information that expands at this rate depends on information extraction technology that can turn unstructured data into organized data that computers can understand and process.

Background: The primary goal of named entity recognition (NER) is to extract named entities from unstructured text and assign them to predefined semantic classes.

Methodology: We analyze various machine learning algorithms and implement k-nearest neighbors (K-NN), which is widely used in machine learning and remains one of the most popular methods for classifying data.

Contribution: To the best of the researchers' knowledge, no published study has presented named entity recognition for the Kikuyu language using a machine learning algorithm. This research fills that gap by recognizing entities in Kikuyu-language text.

Findings: The approach was evaluated using precision, recall, and F-measure. The experimental results indicate that K-NN is effective for this classification task.

Recommendation for Researchers: With enough training data, researchers could run experiments and examine the learning curve, aiming for accuracy comparable to state-of-the-art NER.

Future Research: Future studies could apply unsupervised and semi-supervised learning algorithms to other resource-scarce languages.
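The abstract names K-NN classification and evaluation by precision, recall, and F-measure but gives no implementation details. The following is a minimal sketch of how those pieces fit together, not the authors' system. The feature set in `features`, the label names `ENT` and `O`, the choice of Euclidean distance, and the default `k=3` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter


def features(token, prev_token):
    """Map a token to a small numeric vector (hypothetical feature set)."""
    return (
        1.0 if token[:1].isupper() else 0.0,              # starts with a capital
        1.0 if prev_token is None else 0.0,               # sentence-initial position
        1.0 if any(c.isdigit() for c in token) else 0.0,  # contains a digit
        min(len(token), 10) / 10.0,                       # capped, scaled length
    )


def knn_predict(train, x, k=3):
    """Label x by majority vote among its k nearest training vectors.

    train is a list of (vector, label) pairs; distance is Euclidean
    (an assumption, since the abstract does not name a metric).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def precision_recall_f1(gold, pred, positive):
    """Precision, recall, and F1 for one class treated as the positive label."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

To use the sketch, one would build `train` by applying `features` to each labeled token of an annotated corpus, call `knn_predict` on the feature vector of each held-out token, and pass the gold and predicted label sequences to `precision_recall_f1`. F1 here is the balanced F-measure (the harmonic mean of precision and recall); whether the paper weights the two equally is not stated in the abstract.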