TB±tree: Index structure for Information Retrieval Systems
M. Fekihal, I. Jaluta, Dinesh Kumar Saini
2015 Second International Conference on Computer Science, Computer Engineering, and Social Media (CSCESM), published 2015-11-19
DOI: 10.1109/CSCESM.2015.7331890
Citations: 1
Abstract
Information retrieval (IR) systems use different indexing techniques to retrieve information, such as inverted files and signature files. However, signature files are suitable only for small IR systems because of their slow response, while inverted files have better response time but a high space overhead. Moreover, inverted files use B±trees for single-word queries. This paper presents a new indexing structure, called the TB±tree, to be used in the design of inverted files for large information retrieval systems. The TB±tree is a variant of the B±tree that supports single-keyword queries and phrase queries efficiently. The TB±tree algorithms represent each keyword stored in the index by a numeric value, and this numeric value can also serve as a form of encryption to enforce security. The numeric value for each keyword is stored in binary format, which may reduce the size of the index file by 19%. The records in the TB±tree may be of variable length.
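
The abstract does not say how keywords are mapped to numeric values or how records are laid out, so the following is only a minimal sketch of the general idea, not the TB±tree algorithms themselves. It assumes a hypothetical scheme in which each distinct keyword gets an integer ID by sorted order, and each variable-length record stores that ID, a posting count, and a list of document IDs as fixed-width binary integers. The names `build_keyword_ids`, `pack_record`, and `unpack_record` are illustrative and do not come from the paper.

```python
import struct


def build_keyword_ids(keywords):
    """Assign each distinct keyword a numeric ID.

    Assumption: IDs follow sorted order, starting at 1. The paper's actual
    keyword-to-number mapping is not described in the abstract.
    """
    return {word: i for i, word in enumerate(sorted(set(keywords)), start=1)}


def pack_record(keyword_id, doc_ids):
    """Pack one variable-length record as little-endian unsigned 32-bit ints.

    Layout (assumed): keyword ID, number of postings, then the postings.
    """
    return struct.pack(f"<II{len(doc_ids)}I", keyword_id, len(doc_ids), *doc_ids)


def unpack_record(buf):
    """Decode a record produced by pack_record."""
    keyword_id, count = struct.unpack_from("<II", buf, 0)
    doc_ids = list(struct.unpack_from(f"<{count}I", buf, 8))
    return keyword_id, doc_ids


# Build the keyword dictionary and encode one record for "retrieval".
ids = build_keyword_ids(["retrieval", "index", "tree", "index"])
record = pack_record(ids["retrieval"], [3, 17, 42])
decoded = unpack_record(record)
```

In this sketch the index stores only the numeric key, so the keyword text is recoverable only through the separate dictionary. Any size saving depends on the keyword lengths and the integer width chosen; the 19% figure is the paper's own result and is not reproduced here.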