SIR: A statistical information retrieval system
C. D. Parsons
Proceedings of the 1964 19th ACM national conference, 1964
DOI: 10.1145/800257.808921
Abstract
This paper describes the techniques and results of an information retrieval system utilizing an IBM 7094 installation at Phillips Petroleum Company. The Statistical Information Retrieval (SIR) system employs a "co-ordinate concept" with a logic based on the statistical probability that a desired document will be abstracted with a relatively high percentage of the identical or synonym keywords used in posing an inquiry concerning the document subjects. This approach to retrieval allows the use of an unlimited vocabulary, thus eliminating the need for a dictionary or thesaurus. The SIR programming incorporates a unique computing technique, vector manipulations and a search strategy which permit the system to operate efficiently on a large-scale computer. A bibliography and a representative example of the SIR program input and output are contained in an Appendix.
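The matching idea in the abstract can be illustrated with a small modern sketch. This is not the SIR program: the abstract does not specify its computing technique, vector manipulations, or search strategy. The sketch only assumes that an inquiry is a free-form set of keywords (with any synonyms supplied by the inquirer, since no thesaurus is used) and that a document is retrieved when a sufficiently high fraction of those keywords appears among its abstract keywords. The function names, the 0.5 cutoff, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def match_fraction(inquiry: Set[str], doc_keywords: Set[str]) -> float:
    """Fraction of inquiry keywords that also appear among a document's keywords."""
    if not inquiry:
        return 0.0
    return len(inquiry & doc_keywords) / len(inquiry)


def retrieve(
    inquiry: Set[str],
    documents: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs at or above the cutoff, best match first."""
    scored = [(doc_id, match_fraction(inquiry, kws)) for doc_id, kws in documents.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    # Sort by descending score, then by document id for a stable order.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Made-up data for illustration only.
documents = {
    "doc-1": {"catalyst", "polymerization", "ethylene", "reactor"},
    "doc-2": {"drilling", "mud", "viscosity"},
    "doc-3": {"ethylene", "cracking", "furnace"},
}
# The inquirer lists keywords freely; no controlled vocabulary is consulted.
inquiry = {"ethylene", "polymerization", "catalyst", "polyethylene"}
results = retrieve(inquiry, documents)
```

Because scoring depends only on keyword overlap, any term may be used in an inquiry or an abstract without being registered in a dictionary beforehand, which is the property the abstract attributes to the statistical approach.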