Exploiting a Determinant-Based Metric to Evaluate a Word-Embeddings Matrix of Items

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW). Pub Date: 2016-12-01. DOI: 10.1109/ICDMW.2016.0143

Ludovico Boratto, S. Carta, G. Fenu, Roberto Saia

Abstract

To generate effective results, a recommender system must model information about user interests (user profiles). A profile usually contains preferences that reflect the recommendation technique: collaborative systems represent a user by the ratings given to items, while content-based approaches assign a score to semantic/text-based features of the evaluated items. Even though semantic technologies are evolving rapidly and word embeddings (i.e., vector representations of the words in a corpus) are effective in numerous information filtering tasks, collaborative approaches (such as SVD) currently still generate more accurate recommendations. This might happen because classic profiles, in the form of a single vector that collects all the preferences of a user, weaken the power of word embeddings at modeling texts. In this paper we represent a profile as a matrix of word-embedding vectors of the items a user evaluated, and present a novel determinant-based metric that measures the similarity between an unevaluated item and the items in the matrix-based user profile, in order to generate effective content-based recommendations. Experiments performed on three datasets show that our approach ranks items better than collaborative filtering, both when compared to a latent-factor-based approach (SVD) and to a classic neighborhood user-based system.
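The abstract does not state the formula of the determinant-based metric, so the following is only an illustrative sketch of what such a metric could look like, not the authors' definition. It assumes the profile is a matrix whose rows are item embeddings and scores a candidate item by how little the Gram determinant (the squared volume spanned by the rows) grows when the candidate is appended. The function names, the `eps` regularizer, and the normalization are all assumptions made for this example.

```python
import numpy as np


def gram_logdet(rows: np.ndarray, eps: float = 1e-6) -> float:
    """Log-determinant of the regularized Gram matrix rows @ rows.T + eps * I."""
    gram = rows @ rows.T + eps * np.eye(rows.shape[0])
    _sign, logdet = np.linalg.slogdet(gram)  # Gram + eps*I is positive definite
    return float(logdet)


def determinant_similarity(profile: np.ndarray, item: np.ndarray,
                           eps: float = 1e-6) -> float:
    """Hypothetical score in [0, 1): high when `item` lies close to the
    subspace spanned by the profile rows (assumed metric, not the paper's)."""
    # L2-normalize so the score depends on direction, not vector length.
    profile = profile / np.linalg.norm(profile, axis=1, keepdims=True)
    item = item / np.linalg.norm(item)
    augmented = np.vstack([profile, item])
    # The determinant ratio equals the (regularized) squared residual of the
    # item after projecting it onto the span of the profile rows.
    ratio = np.exp(gram_logdet(augmented, eps) - gram_logdet(profile, eps))
    return float(1.0 - ratio / (1.0 + eps))


# Toy profile: two evaluated items embedded in a 3-dimensional space.
profile = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
inside = determinant_similarity(profile, np.array([1.0, 1.0, 0.0]))
outside = determinant_similarity(profile, np.array([0.0, 0.0, 1.0]))
print(inside > outside)
```

Under these assumptions, unevaluated items could be ranked for a user by sorting them by `determinant_similarity` in descending order. Working with `slogdet` rather than `det` avoids underflow when the profile contains many nearly collinear embeddings.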
