Identifying and Profiling User Interest over time using Social Data

Iqra Ali, M. Naeem
2022 24th International Multitopic Conference (INMIC), published 2022-10-21
DOI: 10.1109/INMIC56986.2022.9972955 (https://doi.org/10.1109/INMIC56986.2022.9972955)

Abstract

With the immense population growth of recent years, social data is growing at a rapid pace and can prove to be a rich source of hidden information. This work focuses on identifying user interest in electronic products, especially smartphones, using social data, which can help electronics businesses market their products in a personalized way. According to the literature, most existing approaches attempt to identify user interest from ratings alone. In our understanding, the content of reviews is equally important in identifying people's interests. Therefore, in this paper, we propose a framework that identifies user interests based on both their reviews and their ratings, analyzes those reviews, and profiles user interest. To achieve this, we use website data written in Roman Urdu. To the best of our knowledge, very limited research has been carried out on Roman Urdu data, as Roman Urdu is considered a low-resource language. Concerning our methodology, we first perform topic modeling using Latent Dirichlet Allocation (LDA), Bidirectional Encoder Representations from Transformers (BERT), and a hybrid of both. Based on the identified topics, we perform user interest profiling from the probabilities of each model/brand using the Top2Vec model. We compare the topic modeling results obtained from reviews alone with those obtained from reviews plus ratings. For topic modeling, we measure the coherence score and observe 52% for the hybrid approach, compared with 47% for BERT and 45% for LDA. Finally, we perform human-based validation of the topic modeling by comparing human-identified topics with those identified by our model.
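As a rough illustration of what profiling user interest over time from per-brand probabilities might look like, the sketch below averages the probabilities attached to each review into one profile per user and time period. It is not the authors' method: the abstract does not describe the aggregation step, and the code does not use Top2Vec. The function names, the monthly period format, the brand names, and the probability values are all hypothetical.

```python
from collections import defaultdict


def profile_interest(reviews):
    """Average per-review brand probabilities into one profile per (user, period).

    `reviews` is an iterable of (user, period, probs) tuples, where `probs`
    maps a brand name to the probability assigned to that review.
    This input shape is an assumption made for illustration only.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for user, period, probs in reviews:
        key = (user, period)
        counts[key] += 1
        for brand, p in probs.items():
            sums[key][brand] += p

    profiles = {}
    for key, brand_sums in sums.items():
        means = {b: s / counts[key] for b, s in brand_sums.items()}
        total = sum(means.values())
        # Renormalise so the shares in each profile sum to 1.
        profiles[key] = {b: m / total for b, m in means.items()} if total else means
    return profiles


def top_interest(profiles, user):
    """Return {period: brand with the largest share} for one user, ordered by period."""
    return {
        period: max(profile, key=profile.get)
        for (u, period), profile in sorted(profiles.items())
        if u == user
    }


# Hypothetical data: three reviews by one user across two months.
reviews = [
    ("u1", "2022-01", {"samsung": 0.7, "oppo": 0.3}),
    ("u1", "2022-01", {"samsung": 0.5, "oppo": 0.5}),
    ("u1", "2022-02", {"samsung": 0.2, "oppo": 0.8}),
]
profiles = profile_interest(reviews)
print(top_interest(profiles, "u1"))
```

With this toy input, the dominant interest of user `u1` moves from one brand in the first month to the other in the second, which is the kind of change over time that such a profile is meant to surface.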