Vector Model Based Information Retrieval System With Word Embedding Transformation

J. Brundha, K. Meera
Published in: 2022 10th International Conference on Emerging Trends in Engineering and Technology - Signal and Information Processing (ICETET-SIP-22)
Publication date: 2022-04-29
DOI: 10.1109/ICETET-SIP-2254415.2022.9791503 (https://doi.org/10.1109/ICETET-SIP-2254415.2022.9791503)
Citations: 2

Abstract

Vector-based information retrieval has become one of the trending methods in Natural Language Processing. The embedding vector generated from a document helps identify the documents most relevant to a query. There are various approaches by which embedding vectors can be generated; those implemented here are Word2vec, Glove2vec and Sentence BERT. The information retrieval system also uses word embedding transformations such as PCA and Factor Analysis to improve the model's performance. Most information retrieval systems involve getting a query from the user, preprocessing the query, and returning the information most relevant to it. Results obtained with post-processing methods such as PCA and Factor Analysis are comparatively better, with an increase of 2–3% in mean average precision.
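The abstract gives no implementation details, so the following is only a minimal sketch of the kind of pipeline it describes, not the authors' method. It assumes that document and query embeddings are already available as NumPy arrays (for example from Word2vec or Sentence BERT), that PCA is the post-processing step (computed here through an SVD), and that documents are ranked by cosine similarity, which the abstract does not state. All function names are hypothetical.

```python
import numpy as np


def pca_transform(vectors, n_components):
    """Fit PCA on row vectors; return (projected, mean, components)."""
    mean = vectors.mean(axis=0)
    centered = vectors - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    d = doc_vecs / (np.linalg.norm(doc_vecs, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(d @ q), kind="stable")


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if int(doc_id) in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevant_sets):
    """Mean of average precision over all queries."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevant_sets)]
    return float(np.mean(scores)) if scores else 0.0
```

In use, the PCA would be fitted on the document embeddings, and each query would be projected with the same mean and components, `(query_vec - mean) @ components.T`, before calling `rank_documents`. Comparing `mean_average_precision` with and without the projection gives the kind of comparison the abstract reports.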