beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems
Vojtěch Vančura, Pavel Kordík, Milan Straka
arXiv - CS - Information Retrieval, 2024-09-16, arXiv:2409.10309
Recommender systems often use textual side information to improve their
predictions, especially in cold-start or zero-shot recommendation scenarios,
where traditional collaborative filtering approaches cannot be used. Many
approaches to mining textual side information for recommender systems have been
proposed in recent years, with sentence Transformers being the most prominent.
However, these models are trained to predict semantic similarity without
using interaction data, which carries hidden patterns specific to recommender
systems. In this paper, we propose beeFormer, a framework for training sentence
Transformer models with interaction data. We demonstrate that our models
trained with beeFormer can transfer knowledge between datasets while
outperforming not only semantic similarity sentence Transformers but also
traditional collaborative filtering methods. We also show that training on
multiple datasets from different domains accumulates knowledge in a single
model, unlocking the possibility of training universal, domain-agnostic
sentence Transformer models to mine text representations for recommender
systems. We release the source code, trained models, and additional details
allowing replication of our experiments at
https://github.com/recombee/beeformer.
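
To make the gap between semantic and interaction similarity concrete, here is a minimal sketch in plain Python. It is not the beeFormer training procedure. The item names, the three-dimensional "text embeddings" (stand-ins for sentence Transformer outputs), and the interaction matrix are all hypothetical toy values chosen for illustration. It compares cosine similarity computed from text embeddings with cosine similarity computed from user-item interaction columns.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical item catalogue (assumption, not from the paper).
items = ["tent", "sleeping_bag", "cookbook", "camping_stove"]

# Hypothetical text embeddings standing in for sentence Transformer outputs.
text_emb = {
    "tent": [0.9, 0.1, 0.0],
    "sleeping_bag": [0.8, 0.2, 0.1],
    "cookbook": [0.0, 0.1, 0.9],
    "camping_stove": [0.3, 0.1, 0.7],
}

# Hypothetical interaction matrix: one row per user, one column per item,
# 1 means the user interacted with the item.
interactions = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]


def item_column(name):
    """Return the interaction column (one entry per user) for an item."""
    j = items.index(name)
    return [row[j] for row in interactions]


def semantic_sim(a, b):
    """Similarity of two items based only on their text embeddings."""
    return cosine(text_emb[a], text_emb[b])


def interaction_sim(a, b):
    """Similarity of two items based only on which users interacted with them."""
    return cosine(item_column(a), item_column(b))


if __name__ == "__main__":
    for other in ("tent", "cookbook"):
        print(
            f"camping_stove vs {other}: "
            f"semantic={semantic_sim('camping_stove', other):.3f}, "
            f"interaction={interaction_sim('camping_stove', other):.3f}"
        )
```

In this toy data the text embeddings place the camping stove closer to the cookbook, while the interaction data places it closer to the tent. A model trained only on semantic similarity would miss the second pattern, which is the kind of signal the abstract refers to.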