Sylia Amarouche, Sabrina Mostefai, Fatiha Amirouche, Said Talbi
2022 International Conference on Intelligent Systems and Computer Vision (ISCV), published 2022-05-18. DOI: 10.1109/ISCV54655.2022.9806072 (https://doi.org/10.1109/ISCV54655.2022.9806072)
New approach of smoothing to extend language model in Lucene
This paper extends Lucene, an Information Retrieval (IR) API, by implementing several smoothing techniques. Lucene is a free API written entirely in Java that can be used to build indexing and search engines for text files. Our contribution has two parts. First, we propose a smoothing approach, based on a combination of algorithms already implemented in Lucene, and integrate it into Lucene’s language model to improve its search quality. Second, we implement the Absolute Discount smoothing approach and integrate it into Lucene’s language model. Both approaches were evaluated on an information retrieval test collection. In some cases they yielded very good search results compared with the other approaches implemented in Lucene.
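To make the Absolute Discount smoothing named in the abstract concrete, here is a minimal sketch of its commonly cited formulation: p(w|d) = max(c(w,d) - delta, 0) / |d| + (delta * |d|_u / |d|) * p(w|C), where c(w,d) is the term frequency in the document, |d| is the document length, |d|_u is the number of unique terms in the document, and p(w|C) is the collection language model. This is not the authors' implementation and it deliberately uses no Lucene classes. The class name AbsoluteDiscountSketch, the method names, and the parameter names are all hypothetical, and the numbers in main are made up for illustration only.

```java
/**
 * Hypothetical, Lucene-independent sketch of absolute discount smoothing.
 * It is an illustration under stated assumptions, not the paper's code.
 */
final class AbsoluteDiscountSketch {

    private AbsoluteDiscountSketch() {
        // Static helpers only.
    }

    /**
     * Smoothed probability p(w|d) of one term given one document.
     *
     * @param termFreq       c(w,d): occurrences of the term in the document
     * @param docLength      |d|: total number of term occurrences in the document
     * @param uniqueTerms    |d|_u: number of distinct terms in the document
     * @param collectionProb p(w|C): probability of the term in the whole collection
     * @param delta          discount in [0, 1] subtracted from every seen term's count
     */
    static double probability(long termFreq, long docLength, long uniqueTerms,
                              double collectionProb, double delta) {
        if (docLength <= 0) {
            throw new IllegalArgumentException("docLength must be positive");
        }
        if (delta < 0.0 || delta > 1.0) {
            throw new IllegalArgumentException("delta must be in [0, 1]");
        }
        // Seen terms lose a constant delta from their count; unseen terms get 0 here.
        double discounted = Math.max(termFreq - delta, 0.0) / docLength;
        // The mass removed from the seen terms is redistributed via the collection model.
        double backoffWeight = delta * uniqueTerms / docLength;
        return discounted + backoffWeight * collectionProb;
    }

    /**
     * Query log-likelihood: sum of log p(w|d) over the query terms.
     * termFreqs[i] and collectionProbs[i] describe the i-th query term.
     */
    static double queryLogLikelihood(long[] termFreqs, double[] collectionProbs,
                                     long docLength, long uniqueTerms, double delta) {
        if (termFreqs.length != collectionProbs.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < termFreqs.length; i++) {
            sum += Math.log(probability(termFreqs[i], docLength, uniqueTerms,
                                        collectionProbs[i], delta));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up statistics: a 100-token document with 60 distinct terms.
        double seen = probability(3, 100, 60, 0.001, 0.7);
        double unseen = probability(0, 100, 60, 0.001, 0.7);
        System.out.println("p(seen term | d)   = " + seen);
        System.out.println("p(unseen term | d) = " + unseen);
    }
}
```

An unseen term still receives a non-zero probability through the collection model, which is the purpose of smoothing in a query-likelihood ranker. Wiring such a formula into Lucene would mean adapting it to Lucene's own similarity classes, which this sketch does not attempt.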