Hoang-Long Nguyen, Trong-Nhan Trinh-Huynh, Kim-Hung Le
Published in: 2022 9th NAFOSTED Conference on Information and Computer Science (NICS), 2022-10-31. DOI: 10.1109/NICS56915.2022.10013446 (https://doi.org/10.1109/NICS56915.2022.10013446)
Towards a Robust and Scalable Information Retrieval Framework in Big Data Context
The volume of information in cyberspace is growing exponentially, making it difficult for information retrieval systems to meet demands for performance and accuracy. However, most existing work concentrates on designing natural language processing (NLP) models rather than on building such systems, which requires substantial engineering effort. In this study, we propose a modular framework for an information retrieval system consisting of several large-scale components capable of processing massive data. In addition, the proposed framework offers a high level of customization by letting end-users quickly replace the NLP models to suit different contexts. This shortens the path from research to production deployment for novel NLP models. Evaluation results of our prototype, integrated with Vietnamese retrieval models, show that the proposed framework is highly robust and scalable in big data contexts.
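
The abstract does not describe the framework's interfaces, so the following is only a minimal sketch of what "quickly replacing the NLP models" could look like in a modular retrieval pipeline. Every name here (`RetrievalModel`, `TermOverlapModel`, `CosineModel`, `RetrievalPipeline`, `swap_model`) is a hypothetical illustration and not the authors' API, and the two toy scoring models stand in for real NLP models such as the Vietnamese retrieval models mentioned above.

```python
from __future__ import annotations

import math
from collections import Counter
from typing import Protocol


class RetrievalModel(Protocol):
    """Hypothetical interface: anything that scores a query against a document."""

    def score(self, query: str, document: str) -> float: ...


class TermOverlapModel:
    """Toy model: counts query tokens that also appear in the document."""

    def score(self, query: str, document: str) -> float:
        doc_tokens = set(document.lower().split())
        return float(sum(1 for tok in query.lower().split() if tok in doc_tokens))


class CosineModel:
    """Toy model: cosine similarity between term-frequency vectors."""

    def score(self, query: str, document: str) -> float:
        q = Counter(query.lower().split())
        d = Counter(document.lower().split())
        dot = sum(q[t] * d[t] for t in q)
        q_norm = math.sqrt(sum(v * v for v in q.values()))
        d_norm = math.sqrt(sum(v * v for v in d.values()))
        norm = q_norm * d_norm
        return dot / norm if norm else 0.0


class RetrievalPipeline:
    """Holds the corpus and delegates ranking to whichever model is plugged in."""

    def __init__(self, model: RetrievalModel) -> None:
        self.model = model
        self.documents: list[str] = []

    def index(self, documents: list[str]) -> None:
        self.documents.extend(documents)

    def swap_model(self, model: RetrievalModel) -> None:
        # Replacing the model leaves the indexed corpus untouched.
        self.model = model

    def search(self, query: str, top_k: int = 3) -> list[str]:
        ranked = sorted(
            self.documents,
            key=lambda doc: self.model.score(query, doc),
            reverse=True,
        )
        return ranked[:top_k]


# Usage: index once, then change the model without re-indexing.
pipeline = RetrievalPipeline(TermOverlapModel())
pipeline.index([
    "big data retrieval framework",
    "vietnamese language model",
    "cooking recipes",
])
first = pipeline.search("retrieval framework", top_k=1)
pipeline.swap_model(CosineModel())
second = pipeline.search("vietnamese model", top_k=1)
```

The design choice illustrated is that the pipeline depends only on a small scoring interface, so a new model can be dropped in without touching storage or query handling. A production system of the kind the abstract describes would presumably replace the in-memory list with distributed components, which this sketch does not attempt to model.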