Improving Pseudo Relevance Feedback with Term Relationship using Firefly Algorithm
Authors: Muhammad Fikri Hasani, Rila Mandala
Published in: 2020 IEEE International Conference on Sustainable Engineering and Creative Computing (ICSECC), 2020-12-16
DOI: 10.1109/ICSECC51444.2020.9557560
When searching with an information retrieval (IR) system, the documents returned sometimes do not match the user's information need. Query expansion (QE) based on Pseudo Relevance Feedback (PRF) tries to address this by adding terms, drawn from the top N ranked documents retrieved, that are expected to improve retrieval results. A previous study showed that the firefly algorithm (FA), used as the optimization method, improves the performance of the IR system. However, that study weighted terms using the Rocchio function over the Pseudo Relevant Documents (PRD), so the performance of the IR system may degrade when the PRD contain few or no truly relevant documents. This study therefore scores terms by the term relationship between the query and the PRD, combined with the Rocchio algorithm. The results show that using term relationship, in the form of either word co-occurrence or word similarity, improves the performance of the previously built IR system. In addition, word co-occurrence with Jaccard performs best compared with the previous study and the other combinations. FA itself was able to choose the optimal terms even as the number of top N ranked documents increased. Furthermore, combining term relationship with the Rocchio algorithm yields a higher convergence rate than using term relationship without the Rocchio algorithm.
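
To make the idea of word co-occurrence scoring with Jaccard concrete, here is a minimal sketch. The abstract does not give the paper's exact formula, so everything here is an assumption for illustration: each candidate term is scored by summing, over the query terms, the Jaccard overlap between the sets of pseudo-relevant documents that contain the candidate and the query term. The function names (`doc_sets`, `jaccard_cooccurrence`, `score_candidates`) and the toy documents are hypothetical, and the sketch omits both the Rocchio weighting and the firefly-based term selection.

```python
from typing import Dict, List, Set


def doc_sets(docs: List[List[str]]) -> Dict[str, Set[int]]:
    """Map each term to the set of document indices that contain it."""
    index: Dict[str, Set[int]] = {}
    for i, doc in enumerate(docs):
        for term in set(doc):
            index.setdefault(term, set()).add(i)
    return index


def jaccard_cooccurrence(term: str, query_term: str,
                         index: Dict[str, Set[int]]) -> float:
    """Jaccard overlap of the document sets of two terms (assumed definition)."""
    a = index.get(term, set())
    b = index.get(query_term, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_candidates(query: List[str],
                     docs: List[List[str]]) -> Dict[str, float]:
    """Score every non-query term by its summed co-occurrence with the query terms."""
    index = doc_sets(docs)
    scores: Dict[str, float] = {}
    for term in index:
        if term in query:
            continue  # only expansion candidates are scored
        scores[term] = sum(jaccard_cooccurrence(term, q, index) for q in query)
    return scores


# Toy pseudo-relevant documents (already tokenized); purely illustrative data.
prd = [
    ["firefly", "algorithm", "optimization"],
    ["firefly", "swarm", "optimization"],
    ["query", "expansion", "retrieval"],
]
scores = score_candidates(["firefly"], prd)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.2f}")
```

In this toy setup, terms that appear in the same documents as the query term score higher than terms that appear only in unrelated documents. This is the property that lets a co-occurrence score stay informative even when some of the pseudo-relevant documents are not actually relevant.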