Behavioral pattern clustering for thematic user segmentation in web interaction environments
Authors: Suma Srinath, Nagaraju Baydeti
Journal: Information Sciences, volume 724, Article 122745 (impact factor 6.8; JCR category: Computer Science, Information Systems)
Publication date: 2025-10-02
DOI: 10.1016/j.ins.2025.122745
URL: https://www.sciencedirect.com/science/article/pii/S0020025525008813
Clustering users by interest is a critical component of personalized content delivery. This paper proposes a novel multimodal framework that integrates semantic video classification, contextualized caption generation, and user behavior patterns. The system combines visual and audio features, computed with convolutional and transformer-based encoders, to robustly capture the complex content underlying video descriptions. User browsing profiles are modelled with probabilistic distributions to reflect realistic browsing behavior across six interest categories. These profiles are then clustered using KMeans, DBSCAN, and Agglomerative clustering to identify distinct user groups. Clustering quality is evaluated using the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index, with PCA and t-SNE applied to visually validate cluster coherence. The simulation framework addresses data privacy concerns and the scarcity of real-world data by producing controllable and realistic user behavior traces. Experimental results show that KMeans provides the best trade-off between clustering quality and computational cost. Together, these contributions offer a new perspective on personalized content delivery through fine-grained user segmentation and precise video understanding. Future work will focus on real-time adaptive learning, integration of additional data types, and deployment in large-scale multimedia applications.
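The simulate-cluster-evaluate loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which probabilistic distributions are used, so the Dirichlet draw, the concentration value, the choice of three user groups, and all function names below are assumptions made for illustration. Only KMeans and the Silhouette Score are shown, using the Python standard library alone.

```python
import math
import random

CATEGORIES = 6  # the abstract mentions six interest categories


def simulate_profile(rng, dominant, concentration=8.0):
    """Draw one browsing profile from a Dirichlet distribution (assumed).

    The dominant category gets a larger concentration parameter, so the
    resulting probability vector puts most of its mass there.
    """
    alphas = [concentration if i == dominant else 1.0 for i in range(CATEGORIES)]
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def dist(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (between -1 and 1)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # conventional score for degenerate cases
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


rng = random.Random(0)  # fixed seed so the simulated traces are reproducible
profiles = [simulate_profile(rng, dominant=i % 3) for i in range(90)]
labels = kmeans(profiles, k=3, rng=rng)
score = silhouette(profiles, labels)
print(f"silhouette = {score:.3f}")
```

In practice a library such as scikit-learn would likely replace the hand-written clustering and scoring, and would also cover DBSCAN, Agglomerative clustering, and the Davies-Bouldin and Calinski-Harabasz indices. The standard-library version is used here only to keep the sketch self-contained.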
About the journal:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.