Experiment with Text Summarization as a Positive Hierarchical Fuzzy Logic Ranking Indicator for Domain Specific Retrieval of Malay Translated Hadith

2019 IEEE 9th Symposium on Computer Applications & Industrial Electronics (ISCAIE) Pub Date : 2019-04-01 DOI:10.1109/ISCAIE.2019.8743988

Shaiful Bakhtiar bin Rodzman, Normaly Kamal Ismail, Nurazzah Abd Rahman, Syed Ahmad Aljunid, Hayati Abd Rahman, Z. M. Nor, Ku Muhammad Naim Ku Khalif, Ahmad Yunus Mohd Noor


Citations: 1

Abstract

A ranking function acts as a predictive algorithm that establishes a simple ordering of documents according to their relevance, and the quality of this ordering reflects the effectiveness and accuracy of many types of Information Retrieval (IR), such as Domain Specific Retrieval of Malay Translated Hadith. In this research, a Hierarchical Fuzzy Logic Controller based on a Mamdani-type Fuzzy Inference System was built to define the ranking function on top of the BM25 model. The model takes four inputs: the Ontology BM25 Score, Fabrication Rate of Hadith and Shia Rate of Hadith from the researchers' previous work, and the new additional Positive Rate of Hadith. It also examines four output values of the Final Ranking Score, which consist of three triangular membership functions. The new Positive Rate of Hadith is based on the score produced by the automatic text summarization executed in the pre-processing phase. The proposed system outperformed the original BM25 score and the Vector Space Model (VM) on 5 query topics and on 26 individual queries, while the original BM25 score and the Vector Space Model yielded better results on only 3 and 0 queries respectively, on the P@10, %no and MAP measures. P@10 is the Precision at Rank 10, %no is the percentage of queries with no relevant documents in the top ten retrieved, and MAP is the Mean Average Precision of the queries. The results show that the proposed system can demote negative documents and move relevant documents up the ranking list using the positive indicator, and that it can recall unseen documents through the application of ontology in text retrieval. For future work, the researchers would like to apply new ranking indicators, such as reliability scores from experts and lay users of the Domain Specific Retrieval of Malay Translated Hadith.
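The three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the `runs`/`qrels` dictionary layout and the sample document identifiers are hypothetical, and average precision is assumed to be normalized by the number of relevant documents per query, which the abstract does not specify.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranks that hold a relevant document."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears.

    Assumption: normalized by the total number of relevant documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def evaluate(runs: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int = 10) -> Dict[str, float]:
    """Compute mean P@k, %no and MAP over all queries in `runs`.

    runs:  query id -> ranked list of retrieved document ids (hypothetical layout)
    qrels: query id -> set of relevant document ids (hypothetical layout)
    """
    n = len(runs)
    p_values = []
    ap_values = []
    no_relevant = 0
    for qid, ranked in runs.items():
        relevant = qrels.get(qid, set())
        p = precision_at_k(ranked, relevant, k)
        p_values.append(p)
        ap_values.append(average_precision(ranked, relevant))
        if p == 0.0:
            # No relevant document appeared in the top k for this query.
            no_relevant += 1
    return {
        f"P@{k}": sum(p_values) / n,
        "%no": 100.0 * no_relevant / n,
        "MAP": sum(ap_values) / n,
    }


# Illustrative data only; not taken from the paper.
sample_runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
sample_qrels = {"q1": {"d1", "d3"}, "q2": {"d9"}}
scores = evaluate(sample_runs, sample_qrels)
```

In this sketch a lower %no is better, since it counts queries whose top ten results contain nothing relevant, whereas higher P@10 and MAP indicate a better ranking.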

