IEEE Transactions on Big Data: Latest Articles

A Marketing Topic Traceability Model Based on Domain Preference and Heterogeneous Network
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2025-01-01 DOI: 10.1109/TBDATA.2024.3524831
Tun Li;Di Lei;Qian Li;Rong Wang;Chaolong Jia;Yunpeng Xiao
Abstract: The development of social networks has prompted a shift in marketing strategies, with a surging demand for marketing in vertical domains characterized by high user stickiness and specialization. To address this, we propose a traceability model based on domain preference and heterogeneous networks. First, considering the problem of marketing topic vertical domains features metric and the influence of users’ preference degree for domains on topic propagation, the domains are treated as latent semantics, and the user-topic association matrix sparse matrix is densified using a latent factor model to mine the domain preference information efficiently. Second, considering the complexity of the association between multi-type elements in marketing topics, the HLN2vec (Heterogeneous Layer-wise Networks) model is proposed. This model uses heterogeneous network representation learning and incorporates multi-layer attention networks to learn the representations to portray a marketing topic’s key elements and their relationships. Finally, this paper proposes the DP-Rank (Domain Preference-based) algorithm, which uses domain preference features and an adaptive random walking strategy to quantify element influence. Based on experiments, the proposed model robustly applies in social networks and exhibits clear advantages in measuring vertical domain features of marketing topics, constructing multi-type element relationship networks, and discovering core element influence.
Volume 11, Issue 4, pp. 1692-1706
Citations: 0
A Data-Centric $\ell$-Diversity Model for Securely Publishing Personal Data With Enhanced Utility
IF 5.7, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2025-01-01 DOI: 10.1109/TBDATA.2024.3524832
Abdul Majeed;Seong Oun Hwang
Abstract: In this paper, we propose and implement a novel anonymization model, called data-centric $\ell$-diversity, to effectively safeguard the privacy of individuals with considerably enhanced utility in data publishing scenarios. Through experimental analysis of real-life datasets, we found that when the data quality is poor (e.g., distributions are uneven), most of the existing methods only anonymize some parts of the data (where distributions are balanced) and leave other parts unprocessed, which can lead to explicit privacy disclosures. Furthermore, they do not identify and repair problematic parts of the data before anonymization, and therefore, they are not secure from the threat of privacy breaches. To address these technical problems, in this paper, we implement an automated method that identifies vulnerabilities in the underlying data to be anonymized w.r.t. distribution, and that repairs them by injecting virtual samples of good quality. Later, we implement a data partitioning strategy that creates compact and diverse classes of size $k$, where $k$ is the privacy parameter. Finally, only shallow generalization (or no generalization) is applied to each class to minimally generalize the data, whereas existing methods overly distort data by not improving the quality beforehand, which can lead to poor utility in data-driven services. We conducted detailed experiments on four datasets to justify the performance of our model in realistic scenarios, and achieved promising results from the perspectives of boosted accuracy, privacy preservation, data utility enrichment, and reduced computing overheads. Compared with baseline methods, our model enhanced privacy preservation by 36.56% on three different metrics, and data utility was augmented with 18.65% less information loss and 14.37% greater accuracy. Lastly, our model, on average, has shown a 26.13% reduction in time overheads compared to the SOTA baseline methods.
Volume 11, Issue 5, pp. 2278-2295
Citations: 0
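For readers unfamiliar with the privacy notion named in the abstract above, the following is a minimal sketch of the textbook "distinct $\ell$-diversity" check: every group of records sharing the same quasi-identifier values must contain at least $\ell$ distinct sensitive values. It is not the data-centric model the paper proposes; the function name, column names, and toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """Return True if every equivalence class holds at least l distinct sensitive values.

    An equivalence class is the set of records that share the same values
    on all quasi-identifier columns.
    """
    classes = defaultdict(set)
    for row in records:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].add(row[sensitive])
    return all(len(values) >= l for values in classes.values())


# Hypothetical, already-generalized toy table.
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]

# The second class contains only one distinct disease, so l = 2 fails.
print(is_distinct_l_diverse(records, ["age", "zip"], "disease", 2))
```

A class that fails the check, like the second one here, is the kind of unevenly distributed region that the abstract says existing methods may leave unprotected.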
A Collaborative Network-Based Retrieval Model for Open Source Domain Experts
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2025-01-01 DOI: 10.1109/TBDATA.2024.3524829
Qingxi Peng;Zhenjie Weng;Wei Wang;Xinyi Wang;Lan You
Abstract: Aiming at the problem that the GitHub platform only supports the retrieval of developers through their usernames and it is difficult to directly obtain developers' expertise information, this paper proposes an open source domain expert retrieval model (OSDERM) based on the network representation learning algorithm OSC2vec (Open Source Collaboration to Vector). The model mainly consists of two core parts: Expert Profiling and Expert Finding. Expert Profiling aims to enrich the expertise information in the search results by labeling the expertise of developers; while Expert Finding achieves rapid location of the most suitable domain experts through keyword matching, which greatly saves the time and effort of searching for experts in the open source community. Experiments using the GitHub ecological dataset show that the model outperforms existing comparative algorithms in discovering open source domain experts, and can provide an effective reference for enterprise recruitment.
Volume 11, Issue 4, pp. 1720-1732
Citations: 0
Casformer: Information Popularity Prediction With Adaptive Cascade Sampling and Graph Transformer in Social Networks
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2025-01-01 DOI: 10.1109/TBDATA.2024.3524839
Biao Wang;Zhao Li;Zenghui Xu;Ji Zhang
Abstract: Predicting the popularity of information in social networks is crucial for effective social marketing and recommendation systems. However, accurately comprehending the complex dynamics of information diffusion remains a challenging task. Existing methods, including feature-based approaches, point process models, and deep learning techniques, often fail to capture the fine-grained features of information cascades, such as dynamic diffusion patterns, cascade statistics, and the interplay between spatial and temporal information. To address these limitations, we propose Casformer, a novel graph-based Transformer architecture that effectively learns both micro-level time-aware structural information and macro-level long-term influence along the information propagation process. Casformer employs a cascade attention network (CAT) to capture the micro-level features and a Transformer model to learn the macro-level influence. Furthermore, we introduce an adaptive cascade graph sampling strategy based on the temporal diffusion pattern and cascade statistics of information to obtain the most informative cascade graph sequence. By leveraging multi-level fine-grained evolving features of information cascades, Casformer achieves high accuracy in information popularity prediction. Experimental results on real-world social network and scientific citation network datasets demonstrate the effectiveness and superiority of Casformer compared to state-of-the-art methods in information popularity prediction.
Volume 11, Issue 4, pp. 1652-1663
Citations: 0
Reducing Re-Indexing for Top-k Personalized PageRank Computation on Dynamic Graphs
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2025-01-01 DOI: 10.1109/TBDATA.2024.3524833
Tsuyoshi Yamashita;Naoki Matsumoto;Kunitake Kaneko
Abstract: Top-k Personalized PageRank (PPR) is a graph analysis method used to determine the $k$ most important nodes with respect to a source node. To realize fast Top-k PPR computation, indexing for each node is effective. When we apply the index-based Top-k PPR methods to dynamic graphs, the index becomes stale with edge updates, and index correction is required. Although the existing methods perform index correction for every update to guarantee Top-k PPR accuracy, they involve heavy re-indexing computation or significant memory overhead. This paper proposes a method that achieves comparable accuracy to guaranteed methods while significantly reducing re-indexing by focusing on the fact that index references are concentrated on the nodes whose index is unlikely to change due to edge updates. In particular, our method omits re-indexing as long as we achieve comparable accuracy. Furthermore, our method involves the minimum memory overhead among the existing index-based methods. The space complexity of the index is $\Theta(n + m)$, where $n$ and $m$ are the number of nodes and edges of the graph, respectively. The evaluation results using real-world datasets show that our method achieves more than 0.999 Normalized Discounted Cumulative Gain until 20% of edges are updated from index generation.
Volume 11, Issue 4, pp. 1707-1719
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10819623
Citations: 0
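As background for the abstract above, here is a minimal sketch of what a Top-k Personalized PageRank query computes, using plain power iteration with no index at all. It is not the paper's index-based method; the restart probability `alpha = 0.15`, the rule that dangling nodes return their mass to the source, and the toy graph are assumptions made for illustration.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """Approximate PPR scores with respect to `source` by power iteration.

    adj maps every node to a list of its out-neighbours; every neighbour
    must itself be a key of adj. alpha is the restart probability.
    """
    nodes = list(adj)
    scores = {v: 0.0 for v in nodes}
    scores[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            walk_mass = (1.0 - alpha) * scores[u]
            if adj[u]:
                share = walk_mass / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            else:
                # Assumption: a dangling node sends its mass back to the source.
                nxt[source] += walk_mass
        # Restart step: total mass is 1, so alpha of it jumps to the source.
        nxt[source] += alpha
        scores = nxt
    return scores


def top_k(scores, k):
    """Return the k highest-scoring nodes, breaking ties by node name."""
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


# Hypothetical toy graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
scores = personalized_pagerank(graph, "a")
print(top_k(scores, 2))
```

Running this from scratch after every edge update is the expensive baseline; index-based methods such as the one in the abstract precompute per-node information so that most of this work can be reused.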
Information Switching Patterns of Risk Communication in Social Media During Disasters
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2025-01-01 DOI: 10.1109/TBDATA.2024.3524828
Khondhaker Al Momin;Arif Mohaimin Sadri;Kristin Olofsson;K.K. Muraleetharan;Hugh Gladwin
Abstract: In an era increasingly affected by natural and human-caused disasters, the role of social media in disaster communication has become ever more critical. Despite substantial research on social media use during crises, a significant gap remains in detecting crisis-related misinformation. Detecting deviations in information is fundamental for identifying and curbing the spread of misinformation. This study introduces a novel Information Switching Pattern Model to identify dynamic shifts in perspectives among users who mention each other in crisis-related narratives on social media. These shifts serve as evidence of crisis misinformation affecting user-mention network interactions. The study utilizes advanced natural language processing, network science, and census data to analyze geotagged tweets related to compound disaster events in Oklahoma in 2022. The impact of misinformation is revealed by distinct engagement patterns among various user types, such as bots, private organizations, non-profits, government agencies, and news media throughout different disaster stages. These patterns show how different disasters influence public sentiment, highlight the heightened vulnerability of mobile home communities, and underscore the importance of education and transportation access in crisis response. Understanding these engagement patterns is crucial for detecting misinformation and leveraging social media as an effective tool for risk communication during disasters.
Volume 11, Issue 4, pp. 1733-1744
Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10820023
Citations: 0
FinLLMs: A Framework for Financial Reasoning Dataset Generation With Large Language Models
IF 5.7, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-12-30 DOI: 10.1109/TBDATA.2024.3524083
Ziqiang Yuan;Kaiyuan Wang;Shoutai Zhu;Ye Yuan;Jingya Zhou;Yanlin Zhu;Wenqi Wei
Abstract: Large Language models (LLMs) usually rely on extensive training datasets. In the financial domain, creating numerical reasoning datasets that include a mix of tables and long text often involves substantial manual annotation expenses. To address the limited data resources and reduce the annotation cost, we introduce FinLLMs, a method for generating financial question-answering (QA) data based on common financial formulas using LLMs. First, we compile a list of common financial formulas and construct a graph based on the variables these formulas employ. We then augment the formula set by combining those that share identical variables as new elements. Specifically, we explore formulas obtained by manual annotation and merge those formulas with shared variables by traversing the constructed graph. Finally, utilizing LLMs, we generate financial QA data that encompasses both tabular information and long textual content, building on the collected formula set. Our experiments demonstrate that the synthetic data generated by FinLLMs effectively enhances the performance of various numerical reasoning models in the financial domain, including both pre-trained language models (PLMs) and fine-tuned LLMs. This performance surpasses that of two established benchmark financial QA datasets.
Volume 11, Issue 5, pp. 2264-2277
Citations: 0
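The formula-graph step in the abstract above can be pictured with a small sketch. This is one possible reading, not the authors' implementation: formulas are linked through the variables they mention, and two formulas are merged by substituting one formula's expression wherever its output variable appears as an input of the other. The three formulas, the dictionary layout, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical formula set: output variable -> (input variables, expression).
FORMULAS = {
    "gross_profit": (["revenue", "cogs"], "revenue - cogs"),
    "gross_margin": (["gross_profit", "revenue"], "gross_profit / revenue"),
    "net_income": (
        ["gross_profit", "operating_expenses"],
        "gross_profit - operating_expenses",
    ),
}


def build_variable_graph(formulas):
    """Map each variable to the names of the formulas that mention it."""
    graph = defaultdict(set)
    for out, (inputs, _) in formulas.items():
        for var in [out, *inputs]:
            graph[var].add(out)
    return graph


def merge_shared(formulas):
    """Compose formula pairs in which one formula's output feeds the other."""
    graph = build_variable_graph(formulas)
    merged = {}
    for inner, (inner_inputs, inner_expr) in formulas.items():
        # Any other formula linked to `inner` uses it as an input variable.
        for outer in sorted(graph[inner] - {inner}):
            outer_inputs, outer_expr = formulas[outer]
            pattern = rf"\b{re.escape(inner)}\b"
            expr = re.sub(pattern, f"({inner_expr})", outer_expr)
            kept = [v for v in outer_inputs if v != inner]
            inputs = list(dict.fromkeys(kept + inner_inputs))
            merged[(outer, inner)] = (inputs, expr)
    return merged


merged = merge_shared(FORMULAS)
for (outer, inner), (inputs, expr) in merged.items():
    print(f"{outer} via {inner}: {expr}  inputs={inputs}")
```

Each merged formula needs more reasoning steps than either original, which is the kind of augmented element a question generator could then be prompted with.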
NAGphormer+: A Tokenized Graph Transformer With Neighborhood Augmentation for Node Classification in Large Graphs
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-12-30 DOI: 10.1109/TBDATA.2024.3524081
Jinsong Chen;Chang Liu;Kaiyuan Gao;Gaichao Li;Kun He
Abstract: Graph Transformers, emerging as a new architecture for graph representation learning, suffer from the quadratic complexity and can only handle graphs with at most thousands of nodes. To this end, we propose a Neighborhood Aggregation Graph Transformer (NAGphormer) that treats each node as a sequence containing a series of tokens constructed by our proposed Hop2Token module. For each node, Hop2Token aggregates the neighborhood features from different hops into different representations, producing a sequence of token vectors as one input. In this way, NAGphormer could be trained in a mini-batch manner and thus could scale to large graphs with millions of nodes. To further enhance the model's generalization, we propose NAGphormer+, an extended model of NAGphormer with a novel data augmentation method called Neighborhood Augmentation (NrAug). Based on the output of Hop2Token, NrAug simultaneously augments the features of neighborhoods from global as well as local views. In this way, NAGphormer+ can fully utilize the neighborhood information of multiple nodes, thereby undergoing more comprehensive training and improving the model's generalization capability. Extensive experiments on benchmark datasets from small to large demonstrate the superiority of NAGphormer+ against existing graph Transformers and mainstream GNNs, as well as the original NAGphormer.
Volume 11, Issue 4, pp. 2085-2098
Citations: 0
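The Hop2Token idea described in the abstract above (one token per hop, each token an aggregate of that hop's neighborhood features) can be sketched in a few lines. The sketch assumes a row-normalized adjacency matrix with self-loops and mean aggregation; the abstract does not specify the normalization, so treat that choice, the function name, and the toy graph as illustrative assumptions.

```python
def hop2token(adj, features, num_hops):
    """Build a token sequence per node: tokens[v][k] is row v of A_hat^k X.

    adj is a list of neighbour lists indexed by node id, features is a list
    of feature vectors, and A_hat is the row-normalised adjacency matrix
    with self-loops (an assumed normalisation).
    """
    n = len(features)
    dim = len(features[0])
    neigh = [sorted(set(adj[v]) | {v}) for v in range(n)]
    current = [list(row) for row in features]
    tokens = [[list(row)] for row in current]  # hop 0: the node's own features
    for _ in range(num_hops):
        nxt = []
        for v in range(n):
            # Mean of the previous-hop representations over v and its neighbours.
            agg = [
                sum(current[u][d] for u in neigh[v]) / len(neigh[v])
                for d in range(dim)
            ]
            nxt.append(agg)
        current = nxt
        for v in range(n):
            tokens[v].append(current[v])
    return tokens


# Hypothetical path graph 0 - 1 - 2 with one-dimensional features.
adj = [[1], [0, 2], [1]]
features = [[1.0], [0.0], [0.0]]
tokens = hop2token(adj, features, num_hops=2)
print(tokens[0])  # three tokens for node 0: hop 0, hop 1, hop 2
```

Because the tokens of each node are precomputed once, nodes can be batched independently of the graph structure, which is what lets the model in the abstract train in mini-batches.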
Federated Multi-View Multi-Label Classification
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-12-26 DOI: 10.1109/TBDATA.2024.3522812
Hongdao Meng;Yongjian Deng;Qiyu Zhong;Yipeng Wang;Zhen Yang;Gengyu Lyu
Abstract: Multi-view multi-label classification is a crucial machine learning paradigm aimed at building robust multi-label predictors by integrating heterogeneous features from various sources while addressing multiple correlated labels. However, in real-world applications, concerns over data confidentiality and security often prevent data exchange or fusion across different sources, leading to the challenging issue of data islands. To tackle this problem, we propose a general federated multi-view multi-label classification method, FMVML, which integrates a novel multi-view multi-label classification technique into a federated learning framework. This approach enables cross-view feature fusion and multi-label semantic classification while preserving the data privacy of each independent source. Within this federated framework, we first extract view-specific information from each individual client to capture unique characteristics and then consolidate consensus information from different views on the global server to represent shared features. Unlike previous methods, our approach enhances cross-view fusion and semantic expression by jointly capturing both feature and semantic aspects of specificity and commonality. The final label predictions are generated by combining the view-specific predictions from individual clients and the consensus predictions from the global server. Extensive experiments across various applications demonstrate that FMVML fully leverages multi-view data in a privacy-preserving manner and consistently outperforms state-of-the-art methods.
Volume 11, Issue 4, pp. 2072-2084
Citations: 0
Unlocking Large Language Model Power in Industry: Privacy-Preserving Collaborative Creation of Knowledge Graph
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-12-26 DOI: 10.1109/TBDATA.2024.3522814
Liqiao Xia;Junming Fan;Ajith Parlikad;Xiao Huang;Pai Zheng
Abstract: Semantic expertise remains a reliable foundation for industrial decision-making, while Large Language Models (LLMs) can augment the often limited empirical knowledge by generating domain-specific insights, though the quality of this generative knowledge is uncertain. Integrating LLMs with the collective wisdom of multiple stakeholders could enhance the quality and scale of knowledge, yet this integration might inadvertently raise privacy concerns for stakeholders. In response to this challenge, Federated Learning (FL) is harnessed to improve the knowledge base quality by cryptically leveraging other stakeholders’ knowledge, where knowledge base is represented in Knowledge Graph (KG) form. Initially, a multi-field hyperbolic (MFH) graph embedding method vectorizes entities, furnishing mathematical representations in lieu of solely semantic meanings. The FL framework subsequently encrypted identifies and fuses common entities, whereby the updated entities’ embedding can refine other private entities’ embedding locally, thus enhancing the overall KG quality. Finally, the KG complement method refines and clarifies triplets to improve the overall quality of the KG. An experiment assesses the proposed approach across different industrial KGs, confirming its effectiveness as a viable solution for collaborative KG creation, all while maintaining data security.
Volume 11, Issue 4, pp. 2046-2060
Citations: 0