beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems

arXiv - CS - Information Retrieval, Pub Date: 2024-09-16, DOI: arxiv-2409.10309

Vojtěch Vančura, Pavel Kordík, Milan Straka


Abstract


Recommender systems often use text-side information to improve their predictions, especially in cold-start or zero-shot recommendation scenarios, where traditional collaborative filtering approaches cannot be used. Many approaches to text-mining side information for recommender systems have been proposed over recent years, with sentence Transformers being the most prominent one. However, these models are trained to predict semantic similarity without utilizing interaction data with hidden patterns specific to recommender systems. In this paper, we propose beeFormer, a framework for training sentence Transformer models with interaction data. We demonstrate that our models trained with beeFormer can transfer knowledge between datasets while outperforming not only semantic similarity sentence Transformers but also traditional collaborative filtering methods. We also show that training on multiple datasets from different domains accumulates knowledge in a single model, unlocking the possibility of training universal, domain-agnostic sentence Transformer models to mine text representations for recommender systems. We release the source code, trained models, and additional details allowing replication of our experiments at https://github.com/recombee/beeformer.
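The gap between semantic and interaction similarity that the abstract refers to can be illustrated with a minimal sketch. This is not the beeFormer training procedure; it only compares two item-item cosine similarities on toy data. The item names, the interaction matrix, and the three-dimensional "text embeddings" are all hypothetical values invented for illustration, standing in for what a sentence Transformer and real user logs would provide.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical catalogue of three items.
ITEMS = ["tent", "sleeping bag", "novel about camping"]

# Hypothetical interaction matrix: rows are users, columns are items,
# 1 means the user interacted with the item.
INTERACTIONS = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
]

# Hypothetical text embeddings (made-up numbers, not model output).
TEXT_EMBEDDINGS = {
    "tent": [0.9, 0.1, 0.3],
    "sleeping bag": [0.2, 0.9, 0.1],
    "novel about camping": [0.8, 0.2, 0.4],
}


def item_column(j):
    """Return the interaction vector of item j across all users."""
    return [row[j] for row in INTERACTIONS]


def interaction_similarity(i, j):
    """Similarity of two items based on which users interacted with them."""
    return cosine(item_column(i), item_column(j))


def semantic_similarity(i, j):
    """Similarity of two items based on their text embeddings."""
    return cosine(TEXT_EMBEDDINGS[ITEMS[i]], TEXT_EMBEDDINGS[ITEMS[j]])


if __name__ == "__main__":
    for i, j in [(0, 1), (0, 2)]:
        print(
            f"{ITEMS[i]} vs {ITEMS[j]}: "
            f"interaction={interaction_similarity(i, j):.3f}, "
            f"semantic={semantic_similarity(i, j):.3f}"
        )
```

In this toy setup the text embeddings place the tent closest to the novel, while the interaction data place it closest to the sleeping bag. A text encoder trained only on semantic similarity would therefore rank the two neighbours differently from the users' behaviour, which is the kind of mismatch that training on interaction data is meant to reduce.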
