Analyzing Malay Stemmer Performance Towards Fuzzy Logic Ranking Function on Malay Text Corpus

Shaiful Bakhtiar bin Rodzman, Mohamad Fitri Izuan Abdul Ronie, N. K. Ismail, Nurazzah Abd Rahman, F. Ahmad, Z. M. Nor
Published in: 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP), 2018-09-13
DOI: 10.1109/INFRKM.2018.8464767 (https://doi.org/10.1109/INFRKM.2018.8464767)
Citations: 8

Abstract

To make Information Retrieval (IR) results more accurate, a stemmer is needed to differentiate words when searching for useful information. This research analyzes the processing speed and accuracy of two Malay-language stemmers, the Fatimah Stemmer and the UniSZA Stemmer. It also compares the performance of the Fuzzy Logic Ranking Function when used with each stemmer. Recall and Precision are evaluated against a relevance judgement list provided by the expert. The results show that the UniSZA Stemmer was faster than the Fatimah Stemmer in every experiment set, whereas the Fatimah Stemmer was more accurate, producing many more correctly stemmed words in every experiment set. The results also show that Fuzzy Logic Ranking with the Fatimah Stemmer outperformed Fuzzy Logic Ranking with the UniSZA Stemmer and with the English Porter Stemmer on 5 of the 8 topic sets of queries, as measured by Mean Average Precision. On the topic with the most queries, 'Umum' (11 queries), Fuzzy Logic Ranking with the Fatimah Stemmer also achieved the best results for Precision at Rank 10, Mean Average Precision, and the percentage of queries with no relevant document in the top ten retrieved.
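The retrieval measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of Precision at Rank k, Average Precision, and Mean Average Precision, plus the percentage of queries with no relevant document in the top k. This is not the authors' evaluation code: the function names, the document and query identifiers, and the data layout (a ranked list per query and a set of relevant documents per query) are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranks that hold a relevant document."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Sum of precision at each rank holding a relevant document,
    divided by the total number of relevant documents for the query."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           judgements: Dict[str, Set[str]]) -> float:
    """Average of the per-query Average Precision values."""
    if not runs:
        return 0.0
    scores = [average_precision(ranked, judgements.get(query, set()))
              for query, ranked in runs.items()]
    return sum(scores) / len(scores)


def pct_no_relevant_in_top_k(runs: Dict[str, List[str]],
                             judgements: Dict[str, Set[str]],
                             k: int = 10) -> float:
    """Percentage of queries whose top-k results contain no relevant document."""
    if not runs:
        return 0.0
    empty = sum(
        1 for query, ranked in runs.items()
        if not any(doc_id in judgements.get(query, set()) for doc_id in ranked[:k])
    )
    return 100.0 * empty / len(runs)


# Hypothetical toy data: two queries, their ranked results, and expert judgements.
toy_runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6", "d7", "d8"]}
toy_judgements = {"q1": {"d1", "d3"}, "q2": {"d9"}}

if __name__ == "__main__":
    print("P@10 (q1):", precision_at_k(toy_runs["q1"], toy_judgements["q1"]))
    print("MAP:", mean_average_precision(toy_runs, toy_judgements))
    print("% no relevant in top 10:", pct_no_relevant_in_top_k(toy_runs, toy_judgements))
```

Under these definitions, comparing two stemmers amounts to running the same queries through the same ranking function with each stemmer and computing the measures per topic set against the same judgement list.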