Tucker-Based High-Accuracy Multi-Modal Clustering for Social Information Network
Ren Li; Huazhong Liu; Xiaotong Zhou; Jiawei Wang; Jihong Ding; Laurence T. Yang; Hua Li; Yunfan Zhang
IEEE Transactions on Big Data, vol. 11, no. 4, pp. 1677-1691. Published 2025-01-08.
DOI: 10.1109/TBDATA.2024.3524830
URL: https://ieeexplore.ieee.org/document/10834502/
Impact factor: 5.7. JCR: Q1 (Computer Science, Information Systems).
Citations: 0
Abstract
With the explosion of social media platforms, social information networks generate a substantial amount of data. Tensor-based multi-modal clustering methods have been widely applied in various social information network scenarios, where they mine potential correlations from large-scale heterogeneous data. Nevertheless, the accuracy and efficiency of these methods are seriously restricted by noisy data and the curse of dimensionality. Therefore, this paper presents a Tucker-based multi-modal clustering (TuMC) method and an improved TuMC (ITuMC) method to enhance the accuracy and efficiency of multi-modal clustering. First, we propose two Tucker-based attribute weight ranking learning approaches to calculate the weight tensor efficiently. Then, we present an approach for calculating the Tucker-based selective weighted tensor distance (SWTD), along with the TuMC method. Next, we develop the ITuMC method, which optimizes the calculation efficiency of the SWTD to further improve clustering speed. Finally, we present a Tucker-based multi-modal clustering and service framework for social information networks. Extensive experiments on the social Geolife GPS trajectory and electricity consumption datasets demonstrate that the TuMC and ITuMC methods cluster multi-source heterogeneous data with both higher accuracy and higher efficiency in complex social information networks, as measured by DVI, AR, and execution time.
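The abstract does not define the SWTD or the attribute weight ranking, so the following is only a rough sketch of the general idea of comparing objects in a Tucker-compressed space, not the paper's method. It assumes each object is a small dense tensor, uses a truncated higher-order SVD (HOSVD) as a stand-in for the Tucker step, and takes the weight tensor as a given placeholder. All function names (`unfold`, `mode_product`, `feature_factors`, `compress`, `weighted_core_distance`) are hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Mode-n product: apply `matrix` (J x I_n) along axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)


def feature_factors(samples, ranks):
    """Truncated-HOSVD factor matrices for the feature modes.

    `samples` has shape (N, I_1, ..., I_d); axis 0 indexes the objects.
    Returns one orthonormal (I_k x rank_k) matrix per feature mode.
    """
    factors = []
    for mode, rank in enumerate(ranks, start=1):
        u, _, _ = np.linalg.svd(unfold(samples, mode), full_matrices=False)
        factors.append(u[:, :rank])
    return factors


def compress(sample, factors):
    """Project one object onto the factor subspaces to get its small core."""
    core = sample
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core


def weighted_core_distance(core_a, core_b, weights):
    """Weighted Euclidean distance between two cores (illustrative only;
    `weights` is a placeholder, not the paper's learned weight tensor)."""
    diff = core_a - core_b
    return float(np.sqrt(np.sum(weights * diff ** 2)))


# Usage: four random 6 x 5 objects compressed to 3 x 2 cores.
rng = np.random.default_rng(0)
samples = rng.normal(size=(4, 6, 5))
factors = feature_factors(samples, ranks=(3, 2))
cores = [compress(s, factors) for s in samples]
uniform = np.ones((3, 2))  # assumption: uniform weights
print(weighted_core_distance(cores[0], cores[1], uniform))
```

Distances are computed on the 3 x 2 cores rather than the 6 x 5 originals, which is where a compressed representation can save time when many pairwise distances are needed. With full ranks and unit weights the distance equals the ordinary Frobenius distance, because the factor matrices are orthonormal.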
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.