New approach of smoothing to extend language model in Lucene
Sylia Amarouche, Sabrina Mostefai, Fatiha Amirouche, Said Talbi
2022 International Conference on Intelligent Systems and Computer Vision (ISCV), 2022-05-18
DOI: 10.1109/ISCV54655.2022.9806072
Abstract
This paper extends Lucene, an Information Retrieval (IR) API, by implementing several smoothing techniques. Lucene is a free API written entirely in Java that lets developers build an indexing and search engine for text files. Our contribution has two parts. First, we propose a smoothing approach, based on a combination of algorithms already implemented in Lucene, and integrate it into Lucene’s language model to improve its search capabilities. Second, we implement the Absolute Discount smoothing approach and integrate it into Lucene’s language model as well. Both approaches were evaluated on an information retrieval test collection. In some cases, they yielded very good search results compared with the other approaches implemented in Lucene.
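For readers unfamiliar with Absolute Discount smoothing, the following is a minimal sketch of the textbook formulation, p(w|d) = max(c(w,d) - δ, 0)/|d| + (δ·|d|_u/|d|)·p(w|C), where c(w,d) is the term count in the document, |d| the document length, |d|_u the number of unique terms in the document, and p(w|C) the collection language model. It is not the authors' implementation and does not use the Lucene API; the class name `AbsoluteDiscountSketch`, the method `smoothedProb`, and all parameter names are hypothetical, and the value δ = 0.7 is only an illustrative assumption.

```java
public class AbsoluteDiscountSketch {

    /**
     * Textbook absolute-discount estimate of p(w|d) (hypothetical helper,
     * not part of Lucene and not taken from the paper).
     *
     * @param termFreq       c(w,d): occurrences of the term in the document
     * @param docLen         |d|: total number of term occurrences in the document
     * @param uniqueTerms    |d|_u: number of distinct terms in the document
     * @param collectionProb p(w|C): probability of the term in the whole collection
     * @param delta          discount in [0, 1] subtracted from each seen count
     */
    public static double smoothedProb(long termFreq, long docLen, long uniqueTerms,
                                      double collectionProb, double delta) {
        if (docLen <= 0) {
            throw new IllegalArgumentException("docLen must be positive");
        }
        // Discounted maximum-likelihood part: zero for unseen terms.
        double discounted = Math.max(termFreq - delta, 0.0) / docLen;
        // Probability mass freed by the discount, redistributed via p(w|C).
        double backoffWeight = delta * uniqueTerms / docLen;
        return discounted + backoffWeight * collectionProb;
    }

    public static void main(String[] args) {
        double delta = 0.7; // assumed value, for illustration only
        // A term seen 3 times in a 10-token document with 5 distinct terms.
        System.out.println(smoothedProb(3, 10, 5, 0.01, delta));
        // An unseen term still receives a non-zero probability.
        System.out.println(smoothedProb(0, 10, 5, 0.01, delta));
    }
}
```

Because the discount removed from the seen terms equals the weight given to the collection model, the estimates still sum to one over the vocabulary whenever every seen term has a count of at least δ. Wiring such a formula into Lucene's language-model scoring is a separate step that this sketch does not attempt.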