Information Retrieval System in Bangla Document Ranking using Latent Semantic Indexing
Md. Nesarul Hoque, Rabiul Islam, Md. Sajidul Karim
2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), May 2019, pp. 1-5
DOI: 10.1109/ICASERT.2019.8934837 (https://doi.org/10.1109/ICASERT.2019.8934837)
Like English and other languages, Bangla now contributes significantly to the content of the web, and the volume of stored Bangla information grows day by day. Because the World Wide Web holds so many documents, it is difficult for a user to retrieve the desired information, and finding the useful documents is both time-consuming and tedious. These demands motivate the development of an Information Retrieval (IR) system for document ranking in the Bangla language. In this paper, we build such a retrieval system, in which users find the documents that correspond to their query strings through a ranked index. Although much work has been done on document ranking for English and other languages, we have found very few contributions for Bangla. Many methods have been proposed to implement document ranking, such as the Boolean model, Maximal Marginal Relevance (MMR), Portfolio Theory (PR), Quantum Probability Ranking Principle (QPRP), Query Directed Clustering (QDC), and vector-based TF-IDF. Here, we apply a new approach to the same task for Bangla documents: Latent Semantic Indexing (LSI). LSI relies on a mathematical method called Singular Value Decomposition (SVD). After the decomposition, we apply cosine similarity to rank all the documents. We believe that the performance of the proposed system has reached a trustworthy level.
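The pipeline named in the abstract (SVD of a term-document matrix, then cosine similarity between the query and each document) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the tokenizer, the term weighting, or the number of latent dimensions, so the sketch assumes whitespace tokenization, raw term counts, and k = 2. The toy corpus and the function names `term_document_matrix` and `lsi_scores` are hypothetical. Documents and query are both projected into the latent space with the top-k left singular vectors, which is one common convention among several.

```python
import numpy as np


def term_document_matrix(docs):
    """Build a raw term-count matrix A (terms x documents) from whitespace tokens."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    index = {term: i for i, term in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in doc.split():
            A[index[tok], j] += 1.0
    return index, A


def lsi_scores(docs, query, k=2):
    """Return one cosine similarity per document, computed in a rank-k latent space."""
    index, A = term_document_matrix(docs)

    # SVD: A = U diag(s) Vt, with singular values in descending order.
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]  # top-k left singular vectors (term space -> latent space)

    # Project documents and query into the latent space with the same map U_k^T.
    doc_vecs = (U_k.T @ A).T  # shape: (n_docs, k)
    q = np.zeros(A.shape[0])
    for tok in query.split():
        if tok in index:  # ignore query terms that are not in the vocabulary
            q[index[tok]] += 1.0
    q_vec = U_k.T @ q  # shape: (k,)

    # Cosine similarity, leaving the score at 0 where a vector has zero length.
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    return np.divide(doc_vecs @ q_vec, norms,
                     out=np.zeros(len(docs)), where=norms > 0)


# Hypothetical toy corpus: two documents about sport, two about water.
docs = [
    "ক্রিকেট খেলা",   # "cricket sport"
    "ফুটবল খেলা",    # "football sport"
    "নদী পানি",      # "river water"
    "সমুদ্র পানি",    # "sea water"
]
query = "ক্রিকেট"      # "cricket"

scores = lsi_scores(docs, query, k=2)
ranking = np.argsort(-scores, kind="stable")  # best match first
for rank, j in enumerate(ranking, start=1):
    print(rank, docs[j], round(float(scores[j]), 3))
```

In this toy corpus the two sport documents should both score close to 1.0 and the two water documents close to 0.0, even though the second sport document never contains the query word. It is matched through the term it shares with the first document, which is the effect LSI is meant to capture. A real system would likely need Bangla-specific tokenization, stop-word handling, and a tuned value of k.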