ACM Transactions on Knowledge Discovery from Data: Latest Articles (Page 2)

Imbalance-Robust Multi-Label Self-Adjusting kNN

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-11 DOI: 10.1145/3663575

Victor Gomes de Oliveira Martins Nicola, Karina Valdivia Delgado, Marcelo de Souza Lauretto

Abstract: In the task of multi-label classification in data streams, instances arriving in real time need to be associated with multiple labels simultaneously. Various methods based on the k Nearest Neighbors algorithm have been proposed to address this task. However, these methods face limitations when dealing with imbalanced data streams, a problem that has received limited attention in existing works. To address this gap, this paper introduces the Imbalance-Robust Multi-Label Self-Adjusting kNN (IRMLSAkNN), designed to tackle multi-label imbalanced data streams. IRMLSAkNN’s strength relies on maintaining relevant instances with imbalanced labels by using a discarding mechanism that considers the imbalance ratio per label. In addition, it evaluates subwindows with an imbalance-aware measure to discard older instances that are lacking performance. We conducted statistical experiments on 32 benchmark data streams, evaluating IRMLSAkNN against eight multi-label classification algorithms using common accuracy-aware and imbalance-aware measures. The obtained results demonstrate that IRMLSAkNN consistently outperforms these algorithms in terms of predictive capacity and time cost across various levels of imbalance.

Citations: 0
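
The IRMLSAkNN abstract above refers to a discarding mechanism that considers the imbalance ratio per label. This page does not give the paper's formula, so the Python sketch below assumes a definition commonly used in the multi-label literature: the count of the most frequent label divided by the count of each label. The function name and the toy window are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def imbalance_ratio_per_label(label_rows: Sequence[Sequence[int]]) -> List[float]:
    """Return max_label_count / label_count for every label column.

    Assumed definition (not confirmed by the listing): a ratio of 1.0 marks
    the most frequent label, larger values mark rarer labels, and a label
    that never occurs gets float("inf").
    """
    if not label_rows:
        return []
    n_labels = len(label_rows[0])
    # Count how often each label is set across the window of instances.
    counts = [sum(row[j] for row in label_rows) for j in range(n_labels)]
    max_count = max(counts)
    return [max_count / c if c > 0 else float("inf") for c in counts]


# Hypothetical window of four instances with three binary labels each.
window = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
ratios = imbalance_ratio_per_label(window)
print(ratios)
```

A window-maintenance policy could use such ratios to prefer keeping instances that carry rare labels, but how IRMLSAkNN actually weighs them is described only in the paper itself.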

Mobile User Traffic Generation via Multi-Scale Hierarchical GAN

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-10 DOI: 10.1145/3664655

Tong Li, Shuodi Hui, Shiyuan Zhang, Huandong Wang, Yuheng Zhang, Pan Hui, Depeng Jin, Yong Li

Abstract: Mobile user traffic facilitates diverse applications, including network planning and optimization, whereas large-scale mobile user traffic is rarely available due to privacy concerns. One alternative solution is to generate mobile user traffic data for downstream applications. However, existing generation models cannot simulate the multi-scale temporal dynamics in mobile user traffic on individual and aggregate levels. In this work, we propose a multi-scale hierarchical generative adversarial network (MSH-GAN) containing multiple generators and a multi-class discriminator. Specifically, the mobile traffic usage behavior exhibits a mixture of multiple behavior patterns, which are called micro-scale behavior patterns and are modeled by different pattern generators in our model. Moreover, the traffic usage behavior of different users exhibits strong clustering characteristics, with the co-existence of users with similar and different traffic usage behaviors. Thus, we model each cluster of users as a class in the discriminator’s output, referred to as macro-scale user clusters. Then, the gap between micro-scale behavior patterns and macro-scale user clusters is bridged by introducing the switch mode generators, which describe the traffic usage behavior in switching between different patterns. All users share the pattern generators. In contrast, the switch mode generators are only shared by a specific cluster of users, which models the multi-scale hierarchical structure of the traffic usage behavior of massive users. Finally, we drive MSH-GAN to learn the multi-scale temporal dynamics via a combined loss function, including adversarial loss, clustering loss, aggregated loss, and regularity terms. Extensive experimental results demonstrate that MSH-GAN outperforms state-of-the-art baselines by at least 118.17% in critical data fidelity and usability metrics. Moreover, observations show that MSH-GAN can simulate traffic patterns and pattern switch behaviors.

Citations: 0

Data Completion-guided Unified Graph Learning for Incomplete Multi-View Clustering

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-09 DOI: 10.1145/3664290

Tianhai Liang, Qiangqiang Shen, Shuqin Wang, Yongyong Chen, Guokai Zhang, Junxin Chen

Abstract: Due to its heterogeneous property, multi-view data has attracted wider attention than single-view data for performance improvement. Unfortunately, some instances may have only partially available information because of uncontrollable factors, which gives rise to the incomplete multi-view clustering (IMVC) problem. IMVC aims to partition unlabeled incomplete multi-view data into their clusters by exploiting the heterogeneity of multi-view data and overcoming the difficulty of data loss. However, most existing IMVC methods like BSV, MIC, OMVC, and IVC tend to conduct basic completion processing on the input data, without taking advantage of the correlation between samples and information redundancy. To overcome this issue, we propose a novel IMVC method named Data Completion-guided Unified Graph Learning (DCUGL), which can complete the data of missing views and fuse multiple learned view-specific similarity matrices into one unified graph. Specifically, we first reduce the dimension of the input data to learn multiple view-specific similarity matrices. By stacking all view-specific similarity matrices, DCUGL constructs a third-order tensor with the low-rank constraint, such that sample correlation within and between views can be well explored. Finally, by dividing the original data into observed data and unobserved data, DCUGL can infer and complete the missing data according to the view-specific similarity matrices, and obtain a unified graph, which can be directly used for clustering. To solve the proposed model, we design an iterative algorithm, which is based on the alternating direction method of multipliers (ADMM) framework. The proposed model proves to be superior to state-of-the-art IMVC methods in benchmarks on six challenging datasets.

Citations: 0

FastHGNN: A New Sampling Technique for Learning with Hypergraph Neural Networks

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-09 DOI: 10.1145/3663670

Fengcheng Lu, Michael Kwok-Po Ng

Citations: 0

Learning with Asynchronous Labels

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-03 DOI: 10.1145/3662186

Yu-Yang Qian, Zhen-Yu Zhang, Peng Zhao, Zhi-Hua Zhou

Citations: 0

Variate Associated Domain Adaptation for Unsupervised Multivariate Time Series Anomaly Detection

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-03 DOI: 10.1145/3663573

Yifan He, Yatao Bian, Xi Ding, Bingzhe Wu, Jihong Guan, Ji Zhang, Shuigeng Zhou

Abstract: Multivariate Time Series Anomaly Detection (MTS-AD) is crucial for the effective management and maintenance of devices in complex systems such as server clusters, spacecraft, and financial systems. However, upgrades or cross-platform deployment of these devices introduce the issue of cross-domain distribution shift, which leads to the prototypical problem of Domain Adaptation for MTS-AD. Compared with general domain adaptation problems, MTS-AD domain adaptation presents two peculiar challenges: 1) The dimensions of data from the source domain and the target domain are usually different, so alignment without losing any information is necessary. 2) The association between different variates plays a vital role in the MTS-AD task, which is overlooked by traditional domain adaptation approaches. To address these issues, we propose a Variate Associated domaiN aDaptation method combined with a GrAph Deviation Network (abbreviated as VANDA) for MTS-AD, which includes two major contributions. First, we characterize the intra-domain variate associations of the source domain by a graph deviation network (GDN), which can share parameters across domains without dimension alignment. Second, we propose a sliding similarity to measure the inter-domain variate associations and perform joint training by minimizing the optimal transport distance between source and target data for transferring variate associations across domains. VANDA achieves domain adaptation by transferring both variate associations and GDN parameters from the source domain to the target domain. We construct two pairs of MTS-AD datasets from existing MTS-AD data and combine three domain adaptation strategies with six MTS-AD backbones as the benchmark methods for experimental evaluation and comparison. Extensive experiments demonstrate the effectiveness of our approach, which outperforms the benchmark methods and significantly improves the AD performance of the target domain by effectively utilizing the source domain knowledge.

Citations: 0

Improving Graph Collaborative Filtering with Directional Behavior Enhanced Contrastive Learning

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-02 DOI: 10.1145/3663574

Penghang Yu, Bing-Kun Bao, Zhiyi Tan, Guanming Lu

Abstract: Graph Collaborative Filtering is a widely adopted approach for recommendation, which captures similar behavior features through graph neural networks. Recently, Contrastive Learning (CL) has been demonstrated as an effective method to enhance the performance of graph collaborative filtering. Typically, CL-based methods first perturb users’ historical behavior data (e.g., drop clicked items), then construct a self-discriminating task for behavior representations under different random perturbations. However, for the many inactive users, random perturbation makes their sparse behavior information even more incomplete, thereby harming the behavior feature extraction. To tackle this issue, we design a novel directional perturbation-based CL method to improve the graph collaborative filtering performance. The idea is to perturb node representations through directionally enhancing behavior features. To do so, we propose a simple yet effective feedback mechanism, which fuses the representations of nodes based on behavior similarity. Then, to avoid irrelevant behavior preferences introduced by the feedback mechanism, we construct a behavior self-contrast task before and after feedback, to align the node representations between the final output and the first layer of the GNN. Different from the widely adopted self-discriminating task, the behavior self-contrast task avoids complex message propagation on different perturbed graphs, which is more efficient than previous methods. Extensive experiments on three public datasets demonstrate that the proposed method has distinct advantages over other contrastive learning methods in recommendation accuracy.

Citations: 0
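
The abstract above argues that random perturbation (for example, dropping clicked items) hurts inactive users because their histories are already sparse. The Python sketch below only illustrates that baseline perturbation, not the paper's directional method. The `drop_items` helper, the 0.5 drop rate, and the two toy users are hypothetical.

```python
import random
from typing import Dict, List


def drop_items(history: List[str], drop_rate: float, rng: random.Random) -> List[str]:
    """Keep each interacted item independently with probability 1 - drop_rate."""
    return [item for item in history if rng.random() >= drop_rate]


rng = random.Random(0)  # fixed seed so repeated runs produce the same views

# Hypothetical interaction histories: one active user, one inactive user.
histories: Dict[str, List[str]] = {
    "active_user": [f"item{i}" for i in range(50)],
    "inactive_user": ["item3", "item7"],
}

for user, items in histories.items():
    view = drop_items(items, drop_rate=0.5, rng=rng)
    # With only two interactions, the inactive user can easily lose half
    # or all of their history in a single perturbed view.
    print(user, len(items), "->", len(view))
```

With the same drop rate, the active user still keeps many items in expectation, while the inactive user may be left with one item or none, which is the sparsity problem the abstract describes.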

EML: Emotion-Aware Meta Learning for Cross-Event False Information Detection

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-05-02 DOI: 10.1145/3661485

Yinqiu Huang, Min Gao, Kai Shu, Chenghua Lin, Jia Wang, Wei Zhou

Abstract: Modern social media’s development has dramatically changed how people obtain information. However, the wide dissemination of various false information has severely detrimental effects. Accordingly, many deep learning-based methods have been proposed to detect false information and achieve promising results. However, these methods are unsuitable for new events due to the extremely limited labeled data and the discrepancy between their data distribution and that of existing events. Domain adaptation methods have been proposed to mitigate these problems. However, their performance is suboptimal: they are not sensitive to new events because they aim to align the domain information between existing events, and they hardly capture the fine-grained difference between real and fake claims because they use only semantic information. Therefore, we propose a novel Emotion-aware Meta Learning (EML) approach for cross-event false information early detection, which deeply integrates emotions in meta learning to find event-sensitive initialization parameters that quickly adapt to new events. Emotion-aware meta learning is non-trivial and faces three challenges: 1) How to effectively model semantic and emotional features to capture fine-grained differences. 2) How to reduce the impact of noise in meta learning based on semantic and emotional features. 3) How to detect false information in a zero-shot detection scenario, i.e., with no labeled data for new events. To tackle these challenges, firstly, we construct the emotion-aware meta tasks by selecting claims with similar and opposite emotions to the target claim rather than using the usual random sampling. Secondly, we propose a task weighting method and event-adaptation meta tasks to further improve the model’s robustness and generalization ability for detecting new events. Finally, we propose a weak label annotation method to extend EML to zero-shot detection according to the calculated labels’ confidence. Extensive experiments on real-world datasets show that EML achieves superior performance on false information detection for new events.

Citations: 0

Distributed Pseudo-Likelihood Method for Community Detection in Large-Scale Networks

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-04-16 DOI: 10.1145/3657300

Jiayi Deng, Danyang Huang, Bo Zhang

Abstract: This paper proposes a distributed pseudo-likelihood method (DPL) to conveniently identify the community structure of large-scale networks. Specifically, we first propose a block-wise splitting method to divide large-scale network data into several subnetworks and distribute them among multiple workers. For simplicity, we assume the classical stochastic block model. Then, the DPL algorithm is iteratively implemented for the distributed optimization of the sum of the local pseudo-likelihood functions. At each iteration, each worker updates its local community labels and communicates with the master. The master then broadcasts the combined estimator to each worker for the new iterative steps. Based on the distributed system, DPL significantly reduces the computational complexity of the traditional pseudo-likelihood method using a single machine. Furthermore, to ensure statistical accuracy, we theoretically discuss the requirements of the worker sample size. Moreover, we extend the DPL method to estimate degree-corrected stochastic block models. The superior performance of the proposed distributed algorithm is demonstrated through extensive numerical studies and real data analysis.

Citations: 0
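
The DPL abstract above assumes the classical stochastic block model (SBM), in which each pair of nodes is connected independently with a probability that depends only on the two nodes' communities. The Python sketch below samples an undirected network from such a model so the setting is concrete. It does not implement the DPL algorithm or its block-wise splitting, and the function name, community labels, and probabilities are hypothetical.

```python
import random
from typing import List


def sample_sbm(labels: List[int], block_probs: List[List[float]], seed: int = 0) -> List[List[int]]:
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model.

    labels[i] is the community of node i, and block_probs[a][b] is the
    probability of an edge between a node in community a and a node in
    community b. Self-loops are never generated.
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # One independent Bernoulli draw per unordered node pair.
            if rng.random() < block_probs[labels[i]][labels[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj


# Hypothetical example: two communities of three nodes each, with dense
# within-community and sparse between-community connections.
labels = [0, 0, 0, 1, 1, 1]
block_probs = [[0.8, 0.1],
               [0.1, 0.8]]
adj = sample_sbm(labels, block_probs)
for row in adj:
    print(row)
```

Community detection methods such as pseudo-likelihood estimation work in the opposite direction: given only an adjacency matrix like `adj`, they try to recover `labels`.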

A Survey of Trustworthy Representation Learning Across Domains

IF 3.6, Zone 3 (Computer Science)

ACM Transactions on Knowledge Discovery from Data Pub Date : 2024-04-12 DOI: 10.1145/3657301

Ronghang Zhu, Dongliang Guo, Daiqing Qi, Zhixuan Chu, Xiang Yu, Sheng Li

Abstract: As AI systems have achieved performance good enough to be deployed widely in our daily lives and human society, people both enjoy the benefits brought by these technologies and suffer many social issues induced by these systems. To make AI systems good enough and trustworthy, much research has been done to build guidelines for trustworthy AI systems. Machine learning is one of the most important parts of AI systems, and representation learning is the fundamental technology in machine learning. How to make representation learning trustworthy in real-world applications, e.g., cross-domain scenarios, is very valuable and necessary for both the machine learning and AI system fields. Inspired by the concepts in trustworthy AI, we propose the first framework for trustworthy representation learning across domains, which includes four concepts, i.e., robustness, privacy, fairness, and explainability, to give a comprehensive literature review on this research direction. Specifically, we first introduce the details of the proposed trustworthy framework for representation learning across domains. Second, we provide basic notions and comprehensively summarize existing methods for the trustworthy framework from the four concepts. Finally, we conclude this survey with insights and discussions on future research directions.

Citations: 0