IEEE Transactions on Big Data: Latest Articles

Fine-Tuned Personality Federated Learning for Graph Data
IF 7.2, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-19 DOI: 10.1109/TBDATA.2024.3356388
Meiting Xue;Zian Zhou;Pengfei Jiao;Huijun Tang
Abstract: Federated Learning (FL) empowers multiple clients to collaboratively learn a global generalization model without the need to share their local data, thus reducing privacy risks and expanding the scope of AI applications. However, current works focus less on data in a highly nonidentically distributed manner such as graph data which are common in reality, and ignore the problem of model personalization between clients for graph data training in federated learning. In this paper, we propose a novel personality graph federated learning framework based on variational graph autoencoders that incorporates model contrastive learning and local fine-tuning to achieve personalized federated training on graph data for each client, which is called FedVGAE. Then we introduce an encoder-sharing strategy to the proposed framework that shares the parameters of the encoder layer to further improve personality performance. The node classification and link prediction experiments demonstrate that our method achieves better performance than other federated learning methods on most graph datasets in the non-iid setting. Finally, we conduct ablation experiments, the result demonstrates the effectiveness of our proposed method.
Vol. 10, No. 3, pp. 313-319. Journal Article. https://doi.org/10.1109/TBDATA.2024.3356388
Citations: 0
LGRL: Local-Global Representation Learning for On-the-Fly FG-SBIR
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-19 DOI: 10.1109/TBDATA.2024.3356393
Dawei Dai;Yingge Liu;Yutang Li;Shiyu Fu;Shuyin Xia;Guoyin Wang
Abstract: On-the-fly Fine-grained sketch-based image retrieval (On-the-fly FG-SBIR) framework aim to break the barriers that sketch drawing requires excellent skills and is time-consuming. Considering such problems, a partial sketch with fewer strokes contains only the little local information, and the drawing process may show great difference among users, resulting in poor performance at the early retrieval. In this study, we developed a local-global representation learning (LGRL) method, in which we learn the representations for both the local and global regions of the partial sketch and its target photos. Specifically, we first designed a triplet network to learn the joint embedding space shared between the local and global regions of the entire sketch and its corresponding region of the photo. Then, we divided each partial sketch in the sketch-drawing episode into several local regions; Another learnable module following the triplet network was designed to learn the representations for the local regions of the partial sketch. Finally, by combining both the local and global regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval efficiency on two publicly sketch-retrieval datasets and the practice test.
Vol. 10, No. 4, pp. 543-555. Journal Article. https://doi.org/10.1109/TBDATA.2024.3356393
Citations: 0
GAT-COBO: Cost-Sensitive Graph Neural Network for Telecom Fraud Detection
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-11 DOI: 10.1109/TBDATA.2024.3352978
Xinxin Hu;Haotian Chen;Junjie Zhang;Hongchang Chen;Shuxin Liu;Xing Li;Yahui Wang;Xiangyang Xue
Abstract: Along with the rapid evolution of mobile communication technologies, such as 5G, there has been a significant increase in telecom fraud, which severely dissipates individual fortune and social wealth. In recent years, graph mining techniques are gradually becoming a mainstream solution for detecting telecom fraud. However, the graph imbalance problem, caused by the Pareto principle, brings severe challenges to graph data mining. This emerging and complex issue has received limited attention in prior research. In this paper, we propose a Graph ATtention network with COst-sensitive BOosting (GAT-COBO) for the graph imbalance problem. First, we design a GAT-based base classifier to learn the embeddings of all nodes in the graph. Then, we feed the embeddings into a well-designed cost-sensitive learner for imbalanced learning. Next, we update the weights according to the misclassification cost to make the model focus more on the minority class. Finally, we sum the node embeddings obtained by multiple cost-sensitive learners to obtain a comprehensive node representation, which is used for the downstream anomaly detection task. Extensive experiments on two real-world telecom fraud detection datasets demonstrate that our proposed method is effective for the graph imbalance problem, outperforming the state-of-the-art GNNs and GNN-based fraud detectors. In addition, our model is also helpful for solving the widespread over-smoothing problem in GNNs.
Vol. 10, No. 4, pp. 528-542. Journal Article. https://doi.org/10.1109/TBDATA.2024.3352978
Citations: 0
Core Maintenance on Dynamic Graphs: A Distributed Approach Built on H-Index
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-11 DOI: 10.1109/TBDATA.2024.3352973
Qiang-Sheng Hua;Hongen Wang;Hai Jin;Xuanhua Shi
Abstract: Core number is an essential tool for analyzing graph structure. Graphs in the real world are typically large and dynamic, requiring the development of distributed algorithms to refrain from expensive I/O operations and the maintenance algorithms to address dynamism. Core maintenance updates the core number of each vertex upon the insertion/deletion of vertices/edges. Although the state-of-the-art distributed maintenance algorithm (Weng et al. 2022) can handle multiple edge insertions/deletions simultaneously, it still has two aspects to improve. (I) Parallel processing is not allowed when inserting/removing edges with the same core number, reducing the degree of parallelism and raising the number of rounds. (II) During the implementation phase, only one thread is assigned to the vertices with the same core number, leading to the inability to fully utilize the distributed computing power. Furthermore, the h-index (Lü, et al. 2016) based distributed core decomposition algorithm (Montresor et al. 2013) can fully utilize the distributed computing power where all vertices can be processed in parallel. However, it requires all vertices to recompute their core numbers upon graph changes. In this article, we propose a distributed core maintenance algorithm based on h-index, which circumvents the issues of algorithm (Weng et al. 2022). In addition, our algorithm avoids core numbers recalculation where the numbers do not change. In comparison to the state-of-the-art distributed maintenance algorithm (Weng et al. 2022), the time speedup ratio is at least 100 in the scenarios of both insertion and deletion. Compared to the distributed core decomposition algorithm (Montresor et al. 2013), the average time speedup ratios are 2 and 8 for the cases of insertion and deletion, respectively.
Vol. 10, No. 5, pp. 595-608. Journal Article. https://doi.org/10.1109/TBDATA.2024.3352973
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10388383
Citations: 0
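The h-index-based core decomposition that the abstract above builds on can be illustrated with a small, single-machine sketch. It relies on the known result that repeatedly replacing each vertex's value with the h-index of its neighbors' values, starting from the degrees, converges to the core numbers. This is only an illustration of that static iteration, not the paper's distributed maintenance algorithm; the function names `h_index` and `core_numbers` and the adjacency-dict input format are assumptions made for this example.

```python
def h_index(values):
    """Return the largest h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def core_numbers(adj):
    """Compute core numbers by iterating the h-index operator to a fixed point.

    adj maps each vertex to an iterable of its neighbors (undirected graph,
    every neighbor must also appear as a key). Hypothetical helper for
    illustration only.
    """
    # Start from the degree of each vertex.
    estimate = {v: len(list(neighbors)) for v, neighbors in adj.items()}
    while True:
        # Synchronous update: every vertex reads the previous round's values.
        updated = {v: h_index([estimate[u] for u in adj[v]]) for v in adj}
        if updated == estimate:
            return estimate
        estimate = updated


# A triangle (a, b, c) with one pendant vertex d attached to c.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
print(core_numbers(graph))
```

Each round only needs a vertex's neighbors' current values, which is why this style of computation parallelizes across vertices. The drawback named in the abstract also shows up here: after any graph change, the loop starts again from the degrees for every vertex.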
Online Heterogeneous Streaming Feature Selection Without Feature Type Information
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-08 DOI: 10.1109/TBDATA.2024.3350630
Peng Zhou;Yunyun Zhang;Zhaolong Ling;Yuanting Yan;Shu Zhao;Xindong Wu
Abstract: Feature selection aims to select an optimal minimal feature subset from the original datasets and has become an indispensable preprocessing component before data mining and machine learning, especially in the era of Big Data. However, features may be generated dynamically and arrive individually over time in practice, which we call streaming features. Most existing streaming feature selection methods assume that all dynamically generated features are the same type or assume we can know the feature type for each new arriving feature in advance, but this is unreasonable and unrealistic. Therefore, this paper first studies a practical issue of Online Heterogeneous Streaming Feature Selection without the feature type information before learning, named OHSFS. Specifically, we first model the streaming feature selection issue as a minimax problem. Then, in terms of MIC (Maximal Information Coefficient), we derive a new metric $MIC_{Gain}$ to determine whether a new streaming feature should be selected. To speed up the efficiency of OHSFS, we present the metric $MIC_{Cor}$ that can directly discard low correlation features. Finally, extensive experimental results indicate the effectiveness of OHSFS. Moreover, OHSFS is nonparametric and does not need to know the feature type before learning, which aligns with practical application needs.
Vol. 10, No. 4, pp. 470-485. Journal Article. https://doi.org/10.1109/TBDATA.2024.3350630
Citations: 0
Enhanced Multi-Scale Features Mutual Mapping Fusion Based on Reverse Knowledge Distillation for Industrial Anomaly Detection and Localization
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-08 DOI: 10.1109/TBDATA.2024.3350539
Guoxiang Tong;Quanquan Li;Yan Song
Abstract: Unsupervised anomaly detection methods based on knowledge distillation have exhibited promising results. However, there is still room for improvement in the differential characterization of anomalous samples. In this article, a novel anomaly detection and localization model based on reverse knowledge distillation is proposed, where an enhanced multi-scale feature mutual mapping feature fusion module is proposed to greatly extract discrepant features at different scales. This module helps enhance the difference in anomaly region representation in the teacher-student structure by inhomogeneously fusing features at different levels. Then, the coordinate attention mechanism is introduced in the reverse distillation structure to pay special attention to dominant issues, facilitating nice direction guidance and position encoding. Furthermore, an innovative single-category embedding memory bank, inspired by human memory mechanisms, is developed to normalize single-category embedding to encourage high-quality model reconstruction. Finally, in several categories of the well-known MVTec dataset, our model achieves better results than state-of-the-art models in terms of AUROC and PRO, with an overall average of 98.1%, 98.3%, and 95.0% for detection AUROC scores, localization AUROC scores, and localization PRO scores, respectively, across 15 categories. Extensive experiments are conducted on the ablation study to validate the contribution of each component of the model.
Vol. 10, No. 4, pp. 498-513. Journal Article. https://doi.org/10.1109/TBDATA.2024.3350539
Citations: 0
Scalable Unsupervised Hashing via Exploiting Robust Cross-Modal Consistency
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-08 DOI: 10.1109/TBDATA.2024.3350541
Xingbo Liu;Jiamin Li;Xiushan Nie;Xuening Zhang;Shaohua Wang;Yilong Yin
Abstract: Unsupervised cross-modal hashing has received increasing attention because of its efficiency and scalability for large-scale data retrieval and analysis. However, existing unsupervised cross-modal hashing methods primarily focus on learning shared feature embedding, ignoring robustness and consistency across different modalities. To this end, this study proposes a novel method called scalable unsupervised hashing (SUH) for large-scale cross-modal retrieval. In the proposed method, latent semantic information and common semantic embedding within heterogeneous data are simultaneously exploited using multimodal clustering and collective matrix factorization, respectively. Furthermore, the robust norm is seamlessly integrated into the two processes, making SUH insensitive to outliers. Based on the robust consistency exploited from the latent semantic information and feature embedding, hash codes can be learned discretely to avoid cumulative quantitation loss. The experimental results on five benchmark datasets demonstrate the effectiveness of the proposed method under various scenarios.
Vol. 10, No. 4, pp. 514-527. Journal Article. https://doi.org/10.1109/TBDATA.2024.3350541
Citations: 0
Few-Shot Learning With Multi-Granularity Knowledge Fusion and Decision-Making
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2024-01-08 DOI: 10.1109/TBDATA.2024.3350542
Yuling Su;Hong Zhao;Yifeng Zheng;Yu Wang
Abstract: Few-shot learning (FSL) is a challenging task in classifying new classes from few labelled examples. Many existing models embed class structural knowledge as prior knowledge to enhance FSL against data scarcity. However, they fall short of connecting the class structural knowledge with the limited visual information which plays a decisive role in FSL model performance. In this paper, we propose a unified FSL framework with multi-granularity knowledge fusion and decision-making (MGKFD) to overcome the limitation. We aim to simultaneously explore the visual information and structural knowledge, working in a mutual way to enhance FSL. On the one hand, we strongly connect global and local visual information with multi-granularity class knowledge to explore intra-image and inter-class relationships, generating specific multi-granularity class representations with limited images. On the other hand, a weight fusion strategy is introduced to integrate multi-granularity knowledge and visual information to make the classification decision of FSL. It enables models to learn more effectively from limited labelled examples and allows generalization to new classes. Moreover, considering varying erroneous predictions, a hierarchical loss is established by structural knowledge to minimize the classification loss, where greater degree of misclassification is penalized more. Experimental results on three benchmark datasets show the advantages of MGKFD over several advanced models.
Vol. 10, No. 4, pp. 486-497. Journal Article. https://doi.org/10.1109/TBDATA.2024.3350542
Citations: 0
SCOREH+: A High-Order Node Proximity Spectral Clustering on Ratios-of-Eigenvectors Algorithm for Community Detection
IF 7.2, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2023-12-25 DOI: 10.1109/TBDATA.2023.3346715
Yanhui Zhu;Fang Hu;Lei Hsin Kuo;Jia Liu
Abstract: The research on complex networks has achieved significant progress in revealing the mesoscopic features of networks. Community detection is an important aspect of understanding real-world complex systems. We present in this paper a High-order node proximity Spectral Clustering on Ratios-of-Eigenvectors (SCOREH+) algorithm for locating communities in complex networks. The algorithm improves SCORE and SCORE+ and preserves high-order transitivity information of the network affinity matrix. We optimize the high-order proximity matrix from the initial affinity matrix using the Radial Basis Functions (RBFs) and Katz index. In addition to the optimization of the Laplacian matrix, we implement a procedure that joins an additional eigenvector (the $(k+1)\mathrm{th}$ leading eigenvector) to the spectrum domain for clustering if the network is considered to be a “weak signal” graph. The algorithm has been successfully applied to both real-world and synthetic data sets. The proposed algorithm is compared with state-of-art algorithms, such as ASE, Louvain, Fast-Greedy, Spectral Clustering (SC), SCORE, and SCORE+. To demonstrate the high efficacy of the proposed method, we conducted comparison experiments on eleven real-world networks and a number of synthetic networks with noise. The experimental results in most of these networks demonstrate that SCOREH+ outperforms the baseline methods. Moreover, by tuning the RBFs and their shaping parameters, we may generate state-of-the-art community structures on all real-world networks and even on noisy synthetic networks.
Vol. 10, No. 3, pp. 301-312. Journal Article. https://doi.org/10.1109/TBDATA.2023.3346715
Citations: 0
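For readers unfamiliar with the Katz index named in the abstract above, the following is a minimal sketch of the standard definition: a proximity score that sums walks of every length between two nodes, with walks of length l weighted by beta^l. It does not reproduce how SCOREH+ combines the Katz index with RBFs; the function names `mat_mul` and `katz_proximity`, the truncation at `max_len`, and the plain list-of-lists matrix format are all assumptions made for this example.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def katz_proximity(adj, beta, max_len):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * A**l.

    adj is a square adjacency matrix. The infinite series only converges
    when beta is smaller than 1 / (largest eigenvalue of adj); the
    truncation here is a simplification for illustration.
    """
    n = len(adj)
    total = [[0.0] * n for _ in range(n)]
    # Start from the identity matrix, i.e. A**0.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weight = 1.0
    for _ in range(max_len):
        power = mat_mul(power, adj)  # now holds A**l
        weight *= beta               # now holds beta**l
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * power[i][j]
    return total


# Path graph 0 - 1 - 2: nodes 0 and 2 are not adjacent but share a neighbor.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = katz_proximity(path, beta=0.5, max_len=2)
print(scores[0][1], scores[0][2])
```

In this example nodes 0 and 2 receive a nonzero score through the length-2 walk via node 1, which is the kind of high-order proximity that a plain adjacency matrix does not record.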
Learning Causal Chain Graph Structure via Alternate Learning and Double Pruning
IF 7.5, Zone 3, Computer Science
IEEE Transactions on Big Data Pub Date : 2023-12-25 DOI: 10.1109/TBDATA.2023.3346712
Shujing Yang;Fuyuan Cao;Kui Yu;Jiye Liang
Abstract: Causal chain graphs model the dependency structure between individuals when the assumption of individual independence in causal inference is violated. However, causal chain graphs are often unknown in practice and require learning from data. Existing learning algorithms have certain limitations. Specifically, learning local information requires multiple subset searches, building the skeleton requires additional conditional independence testing, and directing the edges requires obtaining local information from the skeleton again. To remedy these problems, we propose a novel algorithm for learning causal chain graph structure. The algorithm alternately learns the adjacencies and spouses of each variable as local information and doubly prunes them to obtain more accurate local information, which reduces subset searches, improves its accuracy, and facilitates subsequent learning. It then directly constructs the chain graphs skeleton using the learned adjacencies without conditional independence testing. Finally, it directs the edges of complexes using the learned adjacencies and spouses to learn chain graphs without reacquiring local information, further improving its efficiency. We conduct theoretical analysis to prove the correctness of our algorithm and compare it with the state-of-the-art algorithms on synthetic and real-world datasets. The experimental results demonstrate our algorithm is more reliable than its rivals.
Vol. 10, No. 4, pp. 442-456. Journal Article. https://doi.org/10.1109/TBDATA.2023.3346712
Citations: 0