Compressing an inverted file with LCS
Fang-Yie Leu, Yao-Chung Fan
Proceedings of the 28th Annual International Computer Software and Applications Conference, 2004. COMPSAC 2004.
Published: 2004-09-28
DOI: 10.1109/CMPSAC.2004.1342676 (https://doi.org/10.1109/CMPSAC.2004.1342676)

Document index construction is one of the most important concerns in designing an information retrieval system. The most common index structure used in document retrieval is the inverted file, which consists of inverted lists: for each term, a list of pointers to every location at which that term occurs in the collected documents. The size of an inverted file can be reduced by applying compression techniques. We exploit a randomized minimum spanning tree (MST) algorithm, which uses spanning tree verification and randomized sampling.
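
To make the inverted-file structure described above concrete, here is a minimal Python sketch. It is a generic illustration, not the authors' LCS- or MST-based method, which the abstract does not detail. The function names (`build_inverted_file`, `doc_gaps`, `from_gaps`), the whitespace tokenizer, and the choice of gap encoding as the example size-reduction step are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each term to its inverted list of (doc_id, position) pointers.

    Assumption: terms are lowercase, whitespace-separated tokens.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for position, term in enumerate(text.lower().split()):
            index[term].append((doc_id, position))
    return dict(index)


def doc_gaps(postings):
    """Encode the distinct document ids of one inverted list as gaps.

    The first value is the first document id; each later value is the
    difference from the previous id. Small gaps need fewer bits than
    absolute ids, which is one common way to shrink an inverted list.
    """
    doc_ids = sorted({doc_id for doc_id, _ in postings})
    if not doc_ids:
        return []
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]


def from_gaps(gaps):
    """Recover the absolute document ids from a gap-encoded list."""
    doc_ids = []
    running = 0
    for gap in gaps:
        running += gap
        doc_ids.append(running)
    return doc_ids


docs = ["the cat sat", "the dog", "a cat and the cat"]
index = build_inverted_file(docs)
cat_gaps = doc_gaps(index["cat"])
```

In this toy collection, `index["cat"]` holds one pointer per occurrence of the term, and `from_gaps(cat_gaps)` returns the document ids that the gap list was built from, so the encoding is lossless at the document level.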