Examining the merits of feature-specific similarity functions in the news domain using human judgments

Impact factor: 3.0 | CAS partition: Zone 3, Computer Science | JCR: Q2, Computer Science, Cybernetics
Alain D. Starke, Vegard R. Solberg, Sebastian Øverhaug, Christoph Trattner
Journal: User Modeling and User-Adapted Interaction
Published: 2024-08-07
DOI: 10.1007/s11257-024-09412-2 (https://doi.org/10.1007/s11257-024-09412-2)
Citations: 0

Abstract

Online news article recommendations are typically of the ‘more like this’ type, generated by similarity functions. Across three studies, we examined how representative different similarity functions are for news item retrieval by comparing them to human judgments of similarity. In Study 1 (\(N=401\)), participants assessed the overall similarity of ten randomly paired news articles on politics, and we compared their judgments to different feature-specific similarity functions (e.g., based on body text or images). In Study 2, we checked for domain differences in a mixed-methods survey (\(N=45\)), surfacing evidence that the effectiveness of similarity functions differs across news categories (‘Recent Events’, ‘Sport’). In Study 3 (\(N=173\)), we improved the design of Study 1 by controlling for how news articles were matched, differentiating between dissimilar news articles and articles that were matched on a shared topic, named entities, and/or date of publication, across the ‘Recent Events’ and ‘Sport’ categories. Across all studies, we found that users mostly relied on text-based features (e.g., body text, title) for their similarity judgments, and that BodyText:TF-IDF was the function most representative of those judgments. Moreover, the strength of the relationship between human similarity judgments and the similarity scores of feature-specific functions was strongly affected by how news article pairs were matched. We show that humans and similarity functions are better aligned when two news articles are more alike, such as in a news recommendation scenario.
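The comparison described in the abstract, a body-text TF-IDF similarity score set against human similarity ratings, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three article bodies and the human ratings are invented, the smoothed IDF weighting and the Pearson correlation are generic choices, and none of it is taken from the paper's actual pipeline or data.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on any non-alphanumeric character."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF (an assumed variant): terms that occur in every
        # document still receive a small positive weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either list has zero variance."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical article bodies (invented for this example).
bodies = [
    "The parliament passed the new budget after a long debate",
    "After a long debate the parliament approved the budget",
    "The football team won the cup final on Sunday",
]
pairs = [(0, 1), (0, 2), (1, 2)]

# Hypothetical mean human similarity ratings per pair on a 1-5 scale.
human = [4.5, 1.0, 1.2]

vecs = tfidf_vectors(bodies)
scores = [cosine(vecs[i], vecs[j]) for i, j in pairs]

for (i, j), score, rating in zip(pairs, scores, human):
    print(f"pair ({i}, {j}): tfidf={score:.3f} human={rating}")
print(f"correlation with human ratings: {pearson(scores, human):.3f}")
```

With only three pairs the correlation is not meaningful in itself; the sketch only shows the shape of the comparison. A feature-specific function for titles or images would slot in by replacing `tfidf_vectors` and `cosine` with a scorer for that feature.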


Source journal
User Modeling and User-Adapted Interaction (Engineering and Technology: Computer Science, Cybernetics)
CiteScore: 8.90
Self-citation rate: 8.30%
Articles published: 35
Review time: >12 weeks
Journal description: User Modeling and User-Adapted Interaction provides an interdisciplinary forum for the dissemination of novel and significant original research results about interactive computer systems that can adapt themselves to their users, and on the design, use, and evaluation of user models for adaptation. The journal publishes high-quality original papers from, e.g., the following areas: acquisition and formal representation of user models; conceptual models and user stereotypes for personalization; student modeling and adaptive learning; models of groups of users; user model driven personalised information discovery and retrieval; recommender systems; adaptive user interfaces and agents; adaptation for accessibility and inclusion; generic user modeling systems and tools; interoperability of user models; personalization in areas such as affective computing; ubiquitous and mobile computing; language based interactions; multi-modal interactions; virtual and augmented reality; social media and the Web; human-robot interaction; behaviour change interventions; personalized applications in specific domains; privacy, accountability, and security of information for personalization; responsible adaptation: fairness, accountability, explainability, transparency and control; methods for the design and evaluation of user models and adaptive systems.