2013 IEEE 13th International Conference on Data Mining: Latest Papers, Page 10

Transfer Learning across Networks for Collective Classification

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.116

Meng Fang, Jie Yin, Xingquan Zhu

This paper addresses the problem of transferring useful knowledge from a source network to predict node labels in a newly formed target network. While existing transfer learning research has primarily focused on vector-based data, in which the instances are assumed to be independent and identically distributed, how to effectively transfer knowledge across different information networks has not been well studied, mainly because networks may have their distinct node features and link relationships between nodes. In this paper, we propose a new transfer learning algorithm that attempts to transfer common latent structure features across the source and target networks. The proposed algorithm discovers these latent features by constructing label propagation matrices in the source and target networks, and mapping them into a shared latent feature space. The latent features capture common structure patterns shared by two networks, and serve as domain-independent features to be transferred between networks. Together with domain-dependent node features, we thereafter propose an iterative classification algorithm that leverages label correlations to predict node labels in the target network. Experiments on real-world networks demonstrate that our proposed algorithm can successfully achieve knowledge transfer between networks to help improve the accuracy of classifying nodes in the target network.

Citations: 47
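
The abstract of "Transfer Learning across Networks for Collective Classification" mentions label propagation matrices but does not say how they are built. As a rough illustration only, the sketch below runs a generic, textbook-style label propagation iteration, F <- alpha * P * F + (1 - alpha) * Y, on a toy graph. The row-normalised transition matrix, the `alpha` value, the function names, and the six-node network are all assumptions made for this example; this is not the authors' algorithm and it does not include their shared latent feature space.

```python
def label_propagation(adj, seeds, num_classes, alpha=0.8, iters=50):
    """Generic iterative label propagation: F <- alpha * P @ F + (1 - alpha) * Y.

    adj: symmetric adjacency matrix as a list of lists (0/1 or weights).
    seeds: dict mapping node index -> known class index.
    Returns one row of class scores per node.
    """
    n = len(adj)
    # Row-normalise the adjacency matrix into a transition matrix P.
    P = []
    for row in adj:
        total = sum(row)
        P.append([w / total if total else 0.0 for w in row])
    # Y holds one-hot rows for labelled nodes and zeros elsewhere.
    Y = [[0.0] * num_classes for _ in range(n)]
    for node, cls in seeds.items():
        Y[node][cls] = 1.0
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [
            [
                alpha * sum(P[i][j] * F[j][c] for j in range(n))
                + (1 - alpha) * Y[i][c]
                for c in range(num_classes)
            ]
            for i in range(n)
        ]
    return F


def predict(F):
    """Pick the highest-scoring class for every node."""
    return [max(range(len(row)), key=row.__getitem__) for row in F]


# Hypothetical toy network: two triangles joined by the single edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
# Only node 0 (class 0) and node 5 (class 1) start with labels.
labels = predict(label_propagation(ADJ, seeds={0: 0, 5: 1}, num_classes=2))
```

With one seed per triangle, `labels` comes out as `[0, 0, 0, 1, 1, 1]`: each unlabelled node takes the class of the seed in its own triangle.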

Influence Maximization in Dynamic Social Networks

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.145

Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun

Citations: 145

Learning Imbalanced Multi-class Data with Optimal Dichotomy Weights

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.51

Xu-Ying Liu, Qian-Qian Li, Zhi-Hua Zhou

Class imbalance is very common in real data mining tasks. Previous studies focused on the binary-class imbalance problem, whereas the multi-class imbalance problem is more challenging. The error correcting output codes (ECOC) technique can be applied to the class-imbalance problem; however, the standard ECOC aims at maximizing accuracy, ignoring the fact that, when class imbalance is really a problem, the minority classes are more important than the majority classes. To enable ECOC to tackle multi-class imbalance, it is desirable to have an appropriate code matrix, an effective learning strategy, and a decoding strategy emphasizing the minority classes. In this paper, based on the aforementioned considerations, we propose the imECOC method, which works on dichotomies to handle both between-class imbalance and within-class imbalance. As the dichotomy classifiers contribute differently to the final prediction, imECOC assigns weights to dichotomies and uses weighted distance for decoding, where the optimal dichotomy weights are obtained by minimizing a weighted loss in favor of the minority classes. Experimental results on fourteen data sets show that imECOC performs significantly better than many state-of-the-art multi-class imbalance learning methods, no matter whether multi-class F1, G-mean or AUC is used as the evaluation measure.

Citations: 49
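
The imECOC abstract describes decoding with a weighted distance over dichotomies. The sketch below shows the general idea of weighted Hamming decoding for an ECOC code matrix. The code matrix, the classifier outputs, and the weights are made up for illustration, and the weights are picked by hand rather than obtained by minimizing the weighted loss the abstract refers to, so this should be read as a minimal example of the decoding step only.

```python
def weighted_hamming_decode(code_matrix, outputs, weights):
    """Return the index of the class whose codeword is closest to `outputs`.

    code_matrix[k][j] is +1 or -1: the side of dichotomy j that class k is on.
    outputs[j] is the +1/-1 prediction of the j-th dichotomy classifier.
    weights[j] is a non-negative importance assigned to dichotomy j.
    """
    def distance(codeword):
        # Add up the weights of the dichotomies where the codeword disagrees.
        return sum(
            w for bit, out, w in zip(codeword, outputs, weights) if bit != out
        )

    return min(range(len(code_matrix)), key=lambda k: distance(code_matrix[k]))


# Hypothetical 3-class code matrix with 5 dichotomies.
CODE = [
    [+1, +1, -1, -1, +1],   # class 0
    [-1, +1, +1, -1, -1],   # class 1
    [-1, -1, +1, +1, +1],   # class 2
]
outputs = [+1, +1, +1, -1, +1]

uniform = weighted_hamming_decode(CODE, outputs, [1, 1, 1, 1, 1])   # -> 0
weighted = weighted_hamming_decode(CODE, outputs, [1, 1, 3, 1, 1])  # -> 1
```

With uniform weights the distances are 1, 2 and 3, so class 0 wins. Tripling the weight of the third dichotomy changes them to 3, 2 and 3, and the decision moves to class 1, which is how dichotomy weights can shift predictions toward classes that a plain Hamming decoder would overlook.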

Accelerating Active Learning with Transfer Learning

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.160

David C. Kale, Yan Liu

Citations: 31

Mining Evolving Network Processes

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.106

M. Mongiovì, Petko Bogdanov, Ambuj K. Singh

Processes within real world networks evolve according to the underlying graph structure. A number of examples exist in diverse network genres: botnet communication growth, moving traffic jams [1], information foraging [2] in document networks (WWW and Wikipedia), and spread of viral memes or opinions in social networks. The network structure in all the above examples remains relatively fixed, while the shape, size and position of the affected network regions change gradually with time. Traffic jams grow, move, shrink and eventually disappear. Public attention shifts among current hot topics, inducing a similar shift of highly accessed Wikipedia articles. Discovery of such smoothly evolving network processes has the potential to expose the intrinsic mechanisms of complex network dynamics, enable new data-driven models and improve network design. We introduce the novel problem of Mining smoothly evolving processes (MINESMOOTH) in networks with dynamic real-valued node/edge weights. We show that ensuring smooth transitions in the solution is NP-hard even on restricted network structures such as trees. We propose an efficient filtering-based framework, called LEGATO. It achieves 3-7 times higher scores (i.e., larger and more significant processes) compared to alternatives on real networks, and above 80% accuracy in discovering realistic "embedded" processes in synthetic networks. In transportation networks, LEGATO discovers processes that conform to existing traffic jam models. Its results in Wikipedia reveal the temporal evolution of information seeking of Internet users.

Citations: 30

Dynamic Pattern Detection with Temporal Consistency and Connectivity Constraints

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.66

S. Speakman, Yating Zhang, Daniel B. Neill

We explore scalable and accurate dynamic pattern detection methods in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of data over a period of time, dynamic patterns may affect different subsets of the data at each time step. These dynamic patterns require a new approach to define and optimize penalized likelihood ratio statistics in the subset scan framework, as well as new computational techniques that scale to large, real-world networks. To address the first concern, we develop new subset scan methods that allow the detected subset of nodes to change over time, while incorporating temporal consistency constraints to reward patterns that do not dramatically change between adjacent time steps. Second, our Additive Graph Scan algorithm allows our novel scan statistic to process small graphs (500 nodes) in 4.1 seconds on average while maintaining an approximation ratio over 99% compared to an exact optimization method, and to scale to large graphs with over 12,000 nodes in 30 minutes on average. Evaluation results across multiple detection, tracking, and source-tracing tasks demonstrate substantial performance gains achieved by the Dynamic Subset Scan approach.

Citations: 34

Active Query Driven by Uncertainty and Diversity for Incremental Multi-label Learning

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.74

Sheng-Jun Huang, Zhi-Hua Zhou

Citations: 66

Multiclass Semi-Supervised Boosting Using Similarity Learning

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.108

J. Tanha, M. Saberian, M. Someren

Citations: 6

Network Hypothesis Testing Using Mixed Kronecker Product Graph Models

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.165

Sebastián Moreno, Jennifer Neville

The recent interest in networks (social, physical, communication, information, etc.) has fueled a great deal of research on the analysis and modeling of graphs. However, many of the analyses have focused on a single large network (e.g., a subnetwork sampled from Facebook). Although several studies have compared networks from different domains or samples, they largely focus on empirical exploration of network similarities rather than explicit tests of hypotheses. This is in part due to a lack of statistical methods to determine whether two large networks are likely to have been drawn from the same underlying graph distribution. Research on across-network hypothesis testing methods has been limited by (i) difficulties associated with obtaining a set of networks to reason about the underlying graph distribution, and (ii) limitations of current statistical models of graphs that make it difficult to represent variations across networks. In this paper, we exploit the recent development of mixed-Kronecker Product Graph Models, which accurately capture the natural variation in real world graphs, to develop a model-based approach for hypothesis testing in networks.

Citations: 39
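
For readers unfamiliar with Kronecker product graph models, the sketch below shows the plain (non-mixed) version only: a small initiator matrix of probabilities is raised to a Kronecker power, and every edge is then drawn independently. The mixed variant named in the abstract above is not reproduced here, and the 2x2 initiator values, the function names, and the random seed are assumptions chosen purely for illustration.

```python
import random


def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    nb = len(b)
    size = len(a) * nb
    return [
        [a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)]
        for i in range(size)
    ]


def kron_power(theta, k):
    """k-th Kronecker power of the initiator matrix theta (k >= 1)."""
    result = theta
    for _ in range(k - 1):
        result = kron(result, theta)
    return result


def sample_graph(edge_probs, rng):
    """Draw each directed edge independently with its Bernoulli probability."""
    return [[1 if rng.random() < p else 0 for p in row] for row in edge_probs]


THETA = [[0.9, 0.5], [0.5, 0.1]]   # hypothetical 2x2 initiator matrix
probs = kron_power(THETA, 3)       # 8x8 matrix of edge probabilities
graph = sample_graph(probs, random.Random(0))  # one sampled adjacency matrix
```

Here `probs[0][0]` is 0.9 cubed (about 0.729) and `probs[7][7]` is 0.1 cubed (about 0.001), so the sampled graphs tend to be dense around the first nodes and sparse around the last ones. Drawing many graphs from two fitted models and comparing a statistic across the samples is one simple way a model-based test could be set up, though the paper's actual procedure is not described in the abstract.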

Adaptive Model Tree for Streaming Data

2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.46

A. M. Zimmer, Michael Kurze, T. Seidl

Citations: 3