Behavioral pattern clustering for thematic user segmentation in web interaction environments

IF 6.8 | Region 1 (Computer Science) | COMPUTER SCIENCE, INFORMATION SYSTEMS
Suma Srinath, Nagaraju Baydeti
Journal: Information Sciences, Volume 724, Article 122745
Publication date: 2025-10-02
Publication type: Journal Article
DOI: 10.1016/j.ins.2025.122745
URL: https://www.sciencedirect.com/science/article/pii/S0020025525008813
Open access: no
Citations: 0

Abstract

Clustering users by interest is a critical component of personalized content delivery. This paper proposes a novel multimodal framework that integrates semantic video classification, contextualized caption generation, and user behavior patterns. The system combines visual and audio features, computed with convolutional and transformer-based encoders, to robustly capture complex video content for description. User browsing profiles are modelled with probabilistic distributions to reflect realistic browsing behavior across six interest categories. These profiles are then clustered using KMeans, DBSCAN, and Agglomerative clustering to identify distinct user groups. Clustering quality is evaluated using the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index, with PCA and t-SNE applied to validate cluster coherence visually. The simulation framework addresses data privacy concerns and the scarcity of real-world data by producing controllable and realistic user behavior traces. Experimental results demonstrate that KMeans provides the best trade-off between clustering quality and computational cost. Together, these contributions bring fine-grained user segmentation and precise video understanding to personalized content delivery. Future work will focus on adopting real-time adaptive learning, integrating more data types, and deploying the framework in large-scale multimedia applications.
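The profile simulation and clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract says only that profiles follow "probabilistic distributions" over six interest categories, so the Dirichlet sampling, the `strength` parameter, and all function names below are assumptions made for illustration. The sketch uses a plain-Python k-means and silhouette score so that it runs without third-party packages. In practice a library such as scikit-learn provides `KMeans`, `DBSCAN`, `AgglomerativeClustering`, `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score`.

```python
import math
import random

CATEGORIES = 6  # six interest categories per the abstract; their names are not given


def sample_profile(rng, dominant, strength=8.0):
    """Draw one browsing profile from a Dirichlet distribution (an assumption),
    with extra concentration on one dominant category."""
    alphas = [strength if c == dominant else 1.0 for c in range(CATEGORIES)]
    draws = [rng.gammavariate(a, 1.0) for a in alphas]  # normalized gamma draws = Dirichlet
    total = sum(draws)
    return [d / total for d in draws]


def simulate_users(n_users, seed=0):
    """Simulate users whose dominant interest cycles through the categories."""
    rng = random.Random(seed)
    return [sample_profile(rng, i % CATEGORIES) for i in range(n_users)]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: assign each point to its nearest center, then update centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the old center if a cluster empties
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (O(n^2), fine for small n)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    users = simulate_users(120)
    labels, _ = kmeans(users, k=CATEGORIES)
    print("silhouette:", round(silhouette(users, labels), 3))
```

Because the profiles are simulated, the seed and the concentration `strength` control how separable the user groups are. This is one way to read the abstract's point about "controllable" behavior traces.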
Source journal
Information Sciences
Subject area: Engineering and Technology, Computer Science: Information Systems
CiteScore: 14.00
Self-citation rate: 17.30%
Articles published: 1322
Review time: 10.4 months
Journal description: Informatics and Computer Science Intelligent Systems Applications is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and surveying contributions. Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.