A methodology for collection selection in heterogeneous contexts
Authors: Faïza Abbaci, M. Beigbeder, J. Savoy
Published in: Proceedings. International Conference on Information Technology: Coding and Computing, 2002-04-08
DOI: 10.1109/ITCC.2002.1000443 (https://doi.org/10.1109/ITCC.2002.1000443)
We demonstrate that, in an ideal distributed information retrieval environment, collection selection can benefit from taking into account each collection server's ability to return relevant documents. Based on this assumption, we propose a new approach to the collection selection problem. To predict a collection's ability to return relevant documents, we inspect a limited number (n) of the top documents retrieved from each collection and analyse the proximity of the search keywords within them. In our experiments, we vary the parameter n to determine the most appropriate number of top documents to inspect. We also evaluate the retrieval effectiveness of our approach and compare it with both centralized indexing and the CORI (COllection Retrieval Inference) approach. Preliminary results from these experiments, conducted on the WT10g test collection of Web pages, suggest that the proposed method can achieve appreciable retrieval effectiveness.
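The selection idea described above (inspect the top n documents returned by each collection, then score the collection by how close together the query keywords appear) can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the function names `proximity_score` and `rank_collections`, the `window` parameter, whitespace tokenization, and the rule of counting pairs of distinct query terms that occur within `window` tokens of each other. It should not be read as the authors' actual method.

```python
from itertools import combinations


def proximity_score(tokens, query_terms, window=5):
    """Hypothetical score: number of pairs of distinct query terms
    whose positions in the document differ by at most `window` tokens."""
    hits = [(i, t) for i, t in enumerate(tokens) if t in query_terms]
    score = 0
    for (i, a), (j, b) in combinations(hits, 2):
        if a != b and abs(i - j) <= window:
            score += 1
    return score


def rank_collections(top_docs_by_collection, query, n=3, window=5):
    """Rank collections by the summed proximity score of their top n documents.

    `top_docs_by_collection` maps a collection name to the ranked list of
    document texts that collection returned for the query (assumed input shape).
    """
    terms = set(query.lower().split())
    scores = {}
    for name, docs in top_docs_by_collection.items():
        scores[name] = sum(
            proximity_score(doc.lower().split(), terms, window)
            for doc in docs[:n]  # inspect only the first n documents
        )
    # Highest score first; ties keep the input order of the collections.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for this example only.
results = {
    "A": ["distributed information retrieval systems", "cooking recipes for pasta"],
    "B": ["information about trains", "retrieval of lost luggage"],
}
ranking = rank_collections(results, "information retrieval", n=2)
print(ranking)
```

In this toy run, collection A ranks first because one of its inspected documents contains both query terms next to each other, while each document from B contains only one of them. A single-term query always scores zero under this pairwise rule, so a real system would need a fallback for that case.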