A Two-Stage Ranking Scheme for Pseudo Relevance Feedback
Rong Yan, Guanglai Gao
2016 3rd International Conference on Information Science and Control Engineering (ICISCE), pp. 129-133
Published 2016-07-08. DOI: https://doi.org/10.1109/ICISCE.2016.38
Most Pseudo Relevance Feedback (PRF) methods divide the documents in the pseudo-relevant set into relevant and non-relevant according to the user query. This division is too coarse and lowers the robustness of PRF, because non-relevant documents still contain some relevant information and relevant documents contain some non-relevant information. This paper proposes a novel ranking scheme to obtain a higher-quality pseudo-relevant set. We perform automatic topic content analysis on the pseudo-relevant set and divide it into relevant and non-relevant parts at the document content level, so that semantically relevant content can be extracted and good expansion terms selected at a finer granularity. This makes the scheme less sensitive to cases where the top-ranked documents contain very few relevant documents. Experimental results on a real Chinese collection show that our scheme can significantly improve retrieval performance.
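The general idea of ranking first at the document level and then at the content level can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the topic analysis, so plain query-term overlap and sentence splitting stand in for it here, and every name (`two_stage_expansion`, `overlap_score`, `STOPWORDS`, the parameters) is a hypothetical chosen for illustration.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "is", "a", "are", "for", "our"}


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real segmentation."""
    return text.lower().split()


def overlap_score(query_terms, tokens):
    """Count how many tokens of a text unit match a query term."""
    query_set = set(query_terms)
    return sum(1 for t in tokens if t in query_set)


def two_stage_expansion(query, docs, top_docs=2, top_segments=2, n_terms=3):
    q = tokenize(query)
    # Stage 1: rank whole documents; the top ones form the pseudo-relevant set.
    ranked = sorted(docs, key=lambda d: overlap_score(q, tokenize(d)), reverse=True)
    pseudo_relevant = ranked[:top_docs]
    # Stage 2: split those documents into sentences and rank the sentences,
    # so that off-topic parts of a top-ranked document can be discarded.
    segments = [s.strip() for d in pseudo_relevant for s in d.split(".") if s.strip()]
    segments.sort(key=lambda s: overlap_score(q, tokenize(s)), reverse=True)
    kept = segments[:top_segments]
    # Pick expansion terms only from the retained segments.
    counts = Counter(
        t for s in kept for t in tokenize(s) if t not in q and t not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(n_terms)]


docs = [
    "Jaguar cars are fast. The jaguar engine is powerful. Visit our shop for discount coupons",
    "The jaguar is a big cat. Stock prices fell today",
    "Cooking pasta requires boiling water",
]
terms = two_stage_expansion("jaguar engine", docs)
print(terms)
```

In this toy collection the first document is ranked highest but also contains an advertising sentence. Selecting terms from the top-scoring sentences rather than from the whole document keeps words such as "coupons" out of the expansion set, which is the kind of finer-granularity filtering the abstract argues for.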