A Best-of-Both Approach to Improve Match Predictions and Reciprocal Recommendations for Job Search

arXiv - CS - Information Retrieval | Pub Date: 2024-09-17 | DOI: arxiv-2409.10992

Shuhei Goda, Yudai Hayashi, Yuta Saito

Citations: 0

Abstract

Matching users with mutual preferences is a critical aspect of services driven by reciprocal recommendations, such as job search. To produce recommendations in such scenarios, one can predict match probabilities and construct rankings based on these predictions. However, this direct match prediction approach often underperforms due to the extreme sparsity of match labels. Therefore, most existing methods predict preferences separately for each direction (e.g., job seeker to employer and employer to job seeker) and then aggregate the predictions to generate overall matching scores and produce recommendations. However, this typical approach often leads to practical issues, such as biased error propagation between the two models. This paper introduces and demonstrates a novel and practical solution to improve reciprocal recommendations in production by leveraging pseudo-match scores. Specifically, our approach generates dense and more directly relevant pseudo-match scores by combining the true match labels, which are accurate but sparse, with relatively inaccurate but dense match predictions. We then train a meta-model to output the final match predictions by minimizing the prediction loss against the pseudo-match scores. Our method can be seen as a best-of-both (BoB) approach, as it combines the high-level ideas of both direct match prediction and the two separate models approach. It also allows for user-specific weights to construct personalized pseudo-match scores, achieving even better matching performance through appropriate tuning of the weights. Offline experiments on real-world job search data demonstrate the superior performance of our BoB method, particularly with personalized pseudo-match scores, compared to existing approaches in terms of finding potential matches.
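The pipeline described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything concrete here is an assumption made for illustration: the product aggregation of the two directional predictions, the convex combination used to build pseudo-match scores, the logistic meta-model, and all function and variable names (`pseudo_match_scores`, `fit_meta_model`, `user_weights`, and so on) are hypothetical and are not taken from the paper.

```python
import numpy as np


def pseudo_match_scores(match_labels, pred_a_to_b, pred_b_to_a, user_ids, user_weights):
    """Blend sparse true match labels with dense aggregated predictions.

    Assumption: the blend is a per-user convex combination. The paper may
    combine the two signals differently.
    """
    # Dense but noisy match estimate: product of the two directional predictions
    # (one common aggregation choice, assumed here).
    dense_pred = pred_a_to_b * pred_b_to_a
    # Look up each pair's weight from its user; a shared constant weight
    # corresponds to the non-personalized case.
    w = user_weights[user_ids]
    return w * match_labels + (1.0 - w) * dense_pred


def fit_meta_model(features, targets, lr=0.5, n_steps=2000):
    """Fit a logistic meta-model by gradient descent on cross-entropy
    against soft targets (the pseudo-match scores)."""
    n_samples, n_features = features.shape
    theta = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_steps):
        probs = predict(features, theta, bias)
        grad_logits = probs - targets  # d(cross-entropy)/d(logit) for soft targets
        theta -= lr * (features.T @ grad_logits) / n_samples
        bias -= lr * grad_logits.mean()
    return theta, bias


def predict(features, theta, bias):
    """Final match prediction of the meta-model (sigmoid of a linear score)."""
    return 1.0 / (1.0 + np.exp(-(features @ theta + bias)))


# Synthetic data standing in for (job seeker, employer) pairs.
rng = np.random.default_rng(0)
n_pairs, n_users = 200, 10
features = rng.normal(size=(n_pairs, 3))
user_ids = rng.integers(0, n_users, size=n_pairs)
pred_a_to_b = rng.uniform(size=n_pairs)   # e.g. seeker -> employer preference
pred_b_to_a = rng.uniform(size=n_pairs)   # e.g. employer -> seeker preference
match_labels = (rng.uniform(size=n_pairs) < 0.05).astype(float)  # sparse matches
user_weights = np.full(n_users, 0.5)      # tunable per user

targets = pseudo_match_scores(match_labels, pred_a_to_b, pred_b_to_a,
                              user_ids, user_weights)
theta, bias = fit_meta_model(features, targets)
ranking = np.argsort(-predict(features, theta, bias))  # best candidates first
```

Setting every entry of `user_weights` to 1 recovers training on the true labels only (direct match prediction), while setting them to 0 trains purely on the aggregated directional predictions; intermediate and per-user values sit between the two, which is the sense in which such a blend is "best of both" in this sketch.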
