{"title":"Fast ranking in limited space","authors":"Alistair Moffat, J. Zobel","doi":"10.1109/ICDE.1994.283064","journal":{"name":"Proceedings of 1994 IEEE 10th International Conference on Data Engineering"},"publicationDate":"1994-02-14","citationCount":"37","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICDE.1994.283064"}
Ranking techniques have long been suggested as alternatives to conventional Boolean methods for searching document collections. The cost of computing a ranking is, however, greater than the cost of performing a Boolean search, in terms of both memory space and processing time. The authors consider the resources required by the cosine method of ranking, and show that, with a careful application of indexing and selection techniques, both the space and the time required by ranking can be substantially reduced. The methods described in the paper have been used to build a retrieval system with which it is possible to process ranked queries of 40 terms in about 5% of the space required by previous implementations; in as little as 25% of the time; and without measurable degradation in retrieval effectiveness.
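To make the space and time trade-off concrete, the following is a minimal sketch of cosine ranking over an inverted index with a cap on the number of score accumulators. It is an illustration only, not the paper's implementation: the abstract does not specify the algorithm, so the weighting formula, the rarest-term-first processing order, and the names `build_index`, `rank`, and `max_accumulators` are all assumptions made for this example.

```python
import heapq
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Return (index, idf, lengths) for a list of document strings.

    index maps term -> list of (doc_id, term_frequency) postings.
    The weighting (1 + ln tf) * ln(1 + N / df) is an assumed, common choice.
    """
    index = defaultdict(list)
    n = len(docs)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(text.lower().split()).items():
            index[term].append((doc_id, tf))
    idf = {t: math.log(1 + n / len(p)) for t, p in index.items()}
    # Precompute each document's vector length for cosine normalisation.
    squares = [0.0] * n
    for term, postings in index.items():
        for doc_id, tf in postings:
            w = (1 + math.log(tf)) * idf[term]
            squares[doc_id] += w * w
    lengths = [math.sqrt(x) for x in squares]
    return index, idf, lengths


def rank(query, index, idf, lengths, k=3, max_accumulators=1000):
    """Return up to k (doc_id, score) pairs, best first.

    Memory is bounded by max_accumulators: once that many documents have
    an accumulator, later postings only update existing accumulators.
    """
    acc = {}
    q_terms = Counter(query.lower().split())
    # Assumed heuristic: process the rarest (highest-idf) terms first so the
    # limited accumulators go to the most discriminating matches.
    terms = sorted((t for t in q_terms if t in index),
                   key=lambda t: idf[t], reverse=True)
    for t in terms:
        w_q = (1 + math.log(q_terms[t])) * idf[t]
        for doc_id, tf in index[t]:
            if doc_id not in acc and len(acc) >= max_accumulators:
                continue  # cap reached: do not create new accumulators
            w_d = (1 + math.log(tf)) * idf[t]
            acc[doc_id] = acc.get(doc_id, 0.0) + w_q * w_d
    # The query's own length is constant across documents, so it is omitted.
    scored = ((s / lengths[d], d) for d, s in acc.items())
    return [(d, s) for s, d in heapq.nlargest(k, scored)]
```

With this sketch, lowering `max_accumulators` reduces memory at the risk of missing documents that match only the more common query terms; whether that affects retrieval effectiveness on a real collection would have to be measured.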