Optimizing user profile matching: a text-based approach

Q2 Computer Science
Youcef Benkhedda, F. Azouaou
Journal: International Journal of Computers and Applications, pp. 403-412
Published: 2023-05-04
Publication type: Journal Article
DOI: 10.1080/1206212X.2023.2218244 (https://doi.org/10.1080/1206212X.2023.2218244)
Citations: 0

Abstract

The rapid expansion of social media platforms has made linking user profiles across networks an essential part of maintaining a consistent identity. With a reported 4.66 billion users on the web, many are active on several social media platforms simultaneously, and identifying the same user across platforms makes it challenging to integrate profiles from different sources. Various matching schemes based on different user profile features have been proposed over the years, but little is known about using user-generated text as the sole attribute for profile matching. This setting poses real challenges in real-world scenarios: many users have too little text, and the use of non-discrete text information gives the comparison between two social networks quadratic complexity. Our study examines the existing schemes in the literature for matching user profile pairs based only on their generated textual content. We propose and evaluate a two-stage matching approach based on Locality Sensitive Hashing clustering and nearest neighbor search. We also present matching results for different user representations, language models, and matching schemes.
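The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the hashing scheme, the text representation, or any parameters, so the word-level MinHash signatures, the bag-of-words cosine similarity, and every name below (`minhash_signature`, `lsh_buckets`, `match_profiles`, `num_hashes`, `bands`) are assumptions made purely for illustration. Stage 1 uses LSH banding to group profiles from both networks into candidate buckets, which avoids comparing every pair; stage 2 runs a nearest neighbor search only within those candidates.

```python
import hashlib
import math
from collections import Counter, defaultdict


def tokens(text):
    """Naive whitespace tokenizer (assumption: a real system would use a language model)."""
    return text.lower().split()


def minhash_signature(token_set, num_hashes=32):
    """MinHash signature: the minimum hash value of the set under each seeded hash."""
    return [
        min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{tok}".encode(), digest_size=8).digest(), "big"
            )
            for tok in token_set
        )
        for seed in range(num_hashes)
    ]


def lsh_buckets(signatures, bands=16):
    """Stage 1: profiles whose signatures agree on a whole band share a bucket."""
    buckets = defaultdict(set)
    for key, sig in signatures.items():
        rows = len(sig) // bands
        for band in range(bands):
            chunk = tuple(sig[band * rows:(band + 1) * rows])
            buckets[(band, chunk)].add(key)
    return buckets


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def match_profiles(network_a, network_b, num_hashes=32, bands=16):
    """Map each profile id in network_a to its best candidate in network_b (or None)."""
    signatures = {}
    for side, network in (("a", network_a), ("b", network_b)):
        for pid, text in network.items():
            toks = set(tokens(text))
            if toks:  # profiles with no text cannot be hashed and get no candidates
                signatures[(side, pid)] = minhash_signature(toks, num_hashes)

    # Collect, for every profile in A, the profiles in B that share a bucket with it.
    candidates = defaultdict(set)
    for members in lsh_buckets(signatures, bands).values():
        b_ids = [pid for side, pid in members if side == "b"]
        for side, pid in members:
            if side == "a":
                candidates[pid].update(b_ids)

    # Stage 2: nearest neighbor search restricted to the candidate set.
    vec_a = {pid: Counter(tokens(text)) for pid, text in network_a.items()}
    vec_b = {pid: Counter(tokens(text)) for pid, text in network_b.items()}
    return {
        a: max(candidates[a], key=lambda b: cosine(vec_a[a], vec_b[b]), default=None)
        for a in network_a
    }
```

Calling `match_profiles(network_a, network_b)` with two dictionaries that map profile ids to concatenated user text returns one proposed counterpart per profile in `network_a`, or `None` when no candidate shares a bucket. The banding parameters trade recall for speed: more bands with fewer rows each produce larger candidate sets, moving the cost back toward the quadratic comparison the abstract mentions.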
Source journal
International Journal of Computers and Applications
Subject area: Computer Science, Computer Graphics and Computer-Aided Design
CiteScore: 4.70
Self-citation rate: 0.00%
Articles published: 20
Journal description: The International Journal of Computers and Applications (IJCA) is a unique platform for publishing novel ideas, research outcomes and fundamental advances in all aspects of Computer Science, Computer Engineering, and Computer Applications. This is a peer-reviewed international journal with a vision to provide the academic and industrial community a platform for presenting original research ideas and applications. IJCA welcomes four special types of papers in addition to the regular research papers within its scope: (a) Papers for which all results are easily reproducible. For such papers, the authors will be asked to upload "instructions for reproduction", possibly with the source code or stable URLs (from which the code can be downloaded). (b) Papers with negative results. For such papers, the experimental setting and negative results must be presented in detail, and it must be explained clearly why the negative results are important for the research community. The rationale behind this kind of paper is that it helps researchers choose the correct approaches to solve problems and avoid the (already worked out) failed approaches. (c) Detailed reports, case studies and literature review articles about innovative software / hardware, new technology, high impact computer applications and future development with sufficient background and subject coverage. (d) Special issue papers focussing on a particular theme of significant importance, or papers selected from a relevant conference with sufficient improvement and new material to differentiate them from the papers published in the conference proceedings.