2013 IEEE 13th International Conference on Data Mining: Latest Articles

Transfer Learning across Networks for Collective Classification
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.116
Meng Fang, Jie Yin, Xingquan Zhu
Abstract: This paper addresses the problem of transferring useful knowledge from a source network to predict node labels in a newly formed target network. While existing transfer learning research has primarily focused on vector-based data, in which the instances are assumed to be independent and identically distributed, how to effectively transfer knowledge across different information networks has not been well studied, mainly because networks may have their distinct node features and link relationships between nodes. In this paper, we propose a new transfer learning algorithm that attempts to transfer common latent structure features across the source and target networks. The proposed algorithm discovers these latent features by constructing label propagation matrices in the source and target networks, and mapping them into a shared latent feature space. The latent features capture common structure patterns shared by two networks, and serve as domain-independent features to be transferred between networks. Together with domain-dependent node features, we thereafter propose an iterative classification algorithm that leverages label correlations to predict node labels in the target network. Experiments on real-world networks demonstrate that our proposed algorithm can successfully achieve knowledge transfer between networks to help improve the accuracy of classifying nodes in the target network.
Citations: 47
Influence Maximization in Dynamic Social Networks
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.145
Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun
Abstract: Social influence and influence diffusion have been widely studied in online social networks. However, most existing works on influence diffusion focus on static networks. In this paper, we study the problem of maximizing influence diffusion in a dynamic social network. Specifically, the network changes over time and the changes can be only observed by periodically probing some nodes for the update of their connections. Our goal then is to probe a subset of nodes in a social network so that the actual influence diffusion process in the network can be best uncovered with the probing nodes. We propose a novel algorithm to approximate the optimal solution. The algorithm, through probing a small portion of the network, minimizes the possible error between the observed network and the real network. We evaluate the proposed algorithm on both synthetic and real large networks. Experimental results show that our proposed algorithm achieves a better performance than several alternative algorithms.
Citations: 145
Learning Imbalanced Multi-class Data with Optimal Dichotomy Weights
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.51
Xu-Ying Liu, Qian-Qian Li, Zhi-Hua Zhou
Abstract: Class-imbalance is very common in real data mining tasks. Previous studies focused on the binary-class imbalance problem, whereas the multi-class imbalance problem is more challenging. The error correcting output codes (ECOC) technique can be applied to the class-imbalance problem; however, the standard ECOC aims at maximizing accuracy, ignoring the fact that, when class-imbalance is really a problem, the minority classes are more important than the majority classes. To enable ECOC to tackle multi-class imbalance, it is desired to have an appropriate code matrix, an effective learning strategy and a decoding strategy emphasizing the minority classes. In this paper, based on the aforementioned consideration, we propose the imECOC method which works on dichotomies to handle both the between-class imbalance and within-class imbalance. As the dichotomy classifiers contribute differently to the final prediction, imECOC assigns weights to dichotomies and uses weighted distance for decoding, where the optimal dichotomy weights are obtained by minimizing a weighted loss in favor of the minority classes. Experimental results on fourteen data sets show that imECOC performs significantly better than many state-of-the-art multi-class imbalance learning methods, no matter whether multi-class F1, G-mean or AUC is used as the evaluation measure.
Citations: 49
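The abstract above mentions decoding ECOC outputs with a weighted distance. As a rough illustration of that general idea only (not the imECOC method itself), the sketch below decodes with a weighted Hamming distance. The function name `weighted_ecoc_decode`, the toy code matrix `CODE`, and the hand-picked weights are all hypothetical; in the paper the weights are learned by minimizing a weighted loss, which is not reproduced here.

```python
def weighted_ecoc_decode(code_matrix, outputs, weights):
    """Return the index of the class whose codeword is closest to `outputs`.

    code_matrix[c][d] is +1 or -1: the side of dichotomy d that class c is on.
    outputs[d] is the +1/-1 prediction of the d-th dichotomy classifier.
    weights[d] is a non-negative weight for dichotomy d (assumed given here).
    """
    best_class, best_dist = None, float("inf")
    for c, codeword in enumerate(code_matrix):
        # Weighted Hamming distance: add w_d for every disagreeing dichotomy.
        dist = sum(w for bit, out, w in zip(codeword, outputs, weights) if bit != out)
        if dist < best_dist:  # strict '<' keeps the first class on ties
            best_class, best_dist = c, dist
    return best_class


# Hypothetical 3-class, 3-dichotomy code matrix (one row per class).
CODE = [
    [+1, +1, -1],
    [-1, +1, +1],
    [+1, -1, +1],
]
outputs = [+1, -1, -1]  # predictions of the three dichotomy classifiers

uniform_choice = weighted_ecoc_decode(CODE, outputs, [1.0, 1.0, 1.0])
weighted_choice = weighted_ecoc_decode(CODE, outputs, [1.0, 2.0, 0.5])
```

With uniform weights, classes 0 and 2 are equally distant from the outputs and the tie goes to class 0; up-weighting the second dichotomy and down-weighting the third moves the decision to class 2, which is the kind of shift dichotomy weights are meant to produce.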
Accelerating Active Learning with Transfer Learning
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.160
David C. Kale, Yan Liu
Abstract: Active learning, transfer learning, and related techniques are unified by a core theme: efficient and effective use of available data. Active learning offers scalable solutions for building effective supervised learning models while minimizing annotation effort. Transfer learning utilizes existing labeled data from one task to help learning related tasks for which limited labeled data are available. There has been limited research, however, on how to combine these two techniques. In this paper, we present a simple and principled transfer active learning framework that leverages pre-existing labeled data from related tasks to improve the performance of an active learner. We derive an intuitive bound on generalization error for the classifiers learned by this algorithm that provides insight into the algorithm's behavior and the problem in general. Experimental results using several well-known transfer learning data sets confirm our theoretical analysis and demonstrate the effectiveness of our approach.
Citations: 31
Mining Evolving Network Processes
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.106
M. Mongiovì, Petko Bogdanov, Ambuj K. Singh
Abstract: Processes within real world networks evolve according to the underlying graph structure. A number of examples exist in diverse network genres: botnet communication growth, moving traffic jams [1], information foraging [2] in document networks (WWW and Wikipedia), and spread of viral memes or opinions in social networks. The network structure in all the above examples remains relatively fixed, while the shape, size and position of the affected network regions change gradually with time. Traffic jams grow, move, shrink and eventually disappear. Public attention shifts among current hot topics inducing a similar shift of highly accessed Wikipedia articles. Discovery of such smoothly evolving network processes has the potential to expose the intrinsic mechanisms of complex network dynamics, enable new data-driven models and improve network design. We introduce the novel problem of Mining smoothly evolving processes (MINESMOOTH) in networks with dynamic real-valued node/edge weights. We show that ensuring smooth transitions in the solution is NP-hard even on restricted network structures such as trees. We propose an efficient filtering based framework, called LEGATO. It achieves 3-7 times higher scores (i.e. larger and more significant processes) compared to alternatives on real networks, and above 80% accuracy in discovering realistic "embedded" processes in synthetic networks. In transportation networks, LEGATO discovers processes that conform to existing traffic jam models. Its results in Wikipedia reveal the temporal evolution of information seeking of Internet users.
Citations: 30
Dynamic Pattern Detection with Temporal Consistency and Connectivity Constraints
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.66
S. Speakman, Yating Zhang, Daniel B. Neill
Abstract: We explore scalable and accurate dynamic pattern detection methods in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of data over a period of time, dynamic patterns may affect different subsets of the data at each time step. These dynamic patterns require a new approach to define and optimize penalized likelihood ratio statistics in the subset scan framework, as well as new computational techniques that scale to large, real-world networks. To address the first concern, we develop new subset scan methods that allow the detected subset of nodes to change over time, while incorporating temporal consistency constraints to reward patterns that do not dramatically change between adjacent time steps. Second, our Additive Graph Scan algorithm allows our novel scan statistic to process small graphs (500 nodes) in 4.1 seconds on average while maintaining an approximation ratio over 99% compared to an exact optimization method, and to scale to large graphs with over 12,000 nodes in 30 minutes on average. Evaluation results across multiple detection, tracking, and source-tracing tasks demonstrate substantial performance gains achieved by the Dynamic Subset Scan approach.
Citations: 34
Active Query Driven by Uncertainty and Diversity for Incremental Multi-label Learning
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.74
Sheng-Jun Huang, Zhi-Hua Zhou
Abstract: In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A strong multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of a queried label, and an effective classification model, based on whose prediction the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model by combining label ranking with threshold learning, which is incrementally trained to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and actively query the instance-label pairs which can improve the classification model most. Experimental results demonstrate the superiority of the proposed approach to state-of-the-art methods.
Citations: 66
Multiclass Semi-Supervised Boosting Using Similarity Learning
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.108
J. Tanha, M. Saberian, M. Someren
Abstract: In this paper, we consider the multiclass semi-supervised classification problem. A boosting algorithm is proposed to solve the multiclass problem directly. The proposed multiclass approach uses a new multiclass loss function, which includes two terms. The first term is the cost of the multiclass margin and the second term is a regularization term on unlabeled data. The regularization term is used to minimize the inconsistency between the pairwise similarity and the classifier predictions. It assigns the soft labels weighted with the similarity between unlabeled and labeled examples. We then derive a boosting algorithm, named CD-MSSBoost, from the proposed loss function using coordinate gradient descent. The derived algorithm is further used for learning an optimal similarity function for the given data. Our experiments on a number of UCI datasets show that CD-MSSBoost outperforms the state-of-the-art methods for multiclass semi-supervised learning.
Citations: 6
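The abstract above describes soft labels for unlabeled examples that are weighted by their similarity to labeled examples. The following sketch shows one simple way such similarity-weighted soft labels could be computed. It is an illustration under stated assumptions, not the CD-MSSBoost loss or algorithm: the RBF similarity, the `gamma` parameter, and the function names `rbf_similarity` and `soft_labels` are all hypothetical choices made for this example.

```python
import math


def rbf_similarity(x, y, gamma=1.0):
    """Gaussian (RBF) similarity between two points; an assumed choice."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def soft_labels(unlabeled_point, labeled_points, labels, num_classes, gamma=1.0):
    """Class distribution for one unlabeled point.

    Each labeled example votes for its own class with a weight equal to its
    similarity to the unlabeled point; the votes are then normalized to sum to 1.
    """
    scores = [0.0] * num_classes
    for point, label in zip(labeled_points, labels):
        scores[label] += rbf_similarity(unlabeled_point, point, gamma)
    total = sum(scores)
    if total == 0.0:
        # Every similarity underflowed to zero: fall back to a uniform guess.
        return [1.0 / num_classes] * num_classes
    return [s / total for s in scores]


# Two labeled points of class 0 near the origin, one of class 1 far away.
labeled = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
classes = [0, 0, 1]
distribution = soft_labels((0.5, 0.0), labeled, classes, num_classes=2)
```

An unlabeled point lying between the two class-0 examples receives nearly all of its soft-label mass on class 0, because the distant class-1 example contributes a negligible similarity.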
Network Hypothesis Testing Using Mixed Kronecker Product Graph Models
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.165
Sebastián Moreno, Jennifer Neville
Abstract: The recent interest in networks (social, physical, communication, information, etc.) has fueled a great deal of research on the analysis and modeling of graphs. However, many of the analyses have focused on a single large network (e.g., a subnetwork sampled from Facebook). Although several studies have compared networks from different domains or samples, they largely focus on empirical exploration of network similarities rather than explicit tests of hypotheses. This is in part due to a lack of statistical methods to determine whether two large networks are likely to have been drawn from the same underlying graph distribution. Research on across-network hypothesis testing methods has been limited by (i) difficulties associated with obtaining a set of networks to reason about the underlying graph distribution, and (ii) limitations of current statistical models of graphs that make it difficult to represent variations across networks. In this paper, we exploit the recent development of mixed-Kronecker Product Graph Models, which accurately capture the natural variation in real world graphs, to develop a model-based approach for hypothesis testing in networks.
Citations: 39
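For readers unfamiliar with Kronecker product graph models, the sketch below shows the basic (non-mixed) construction: an edge-probability matrix is obtained as a Kronecker power of a small initiator matrix, and a graph is sampled by drawing each edge independently. The mixed variant used in the paper is not shown. The initiator values in `THETA` and the helper names `kron`, `kronecker_power`, and `sample_graph` are hypothetical and chosen only for illustration.

```python
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def kronecker_power(initiator, k):
    """k-th Kronecker power of the initiator matrix (k >= 1)."""
    result = initiator
    for _ in range(k - 1):
        result = kron(result, initiator)
    return result


def sample_graph(edge_probabilities, seed=0):
    """Draw a directed graph: edge (i, j) appears with probability P[i][j]."""
    rng = random.Random(seed)
    return [
        (i, j)
        for i, row in enumerate(edge_probabilities)
        for j, p in enumerate(row)
        if rng.random() < p
    ]


THETA = [[0.9, 0.5],
         [0.5, 0.2]]               # hypothetical 2x2 initiator matrix
P = kronecker_power(THETA, 2)      # 4x4 matrix of edge probabilities
edges = sample_graph(P, seed=42)   # one sampled edge list over 4 nodes
```

Each entry of `P` is a product of initiator entries, for example `P[0][0] = 0.9 * 0.9` and `P[0][3] = 0.5 * 0.5`, so a handful of parameters defines the edge distribution of an arbitrarily large graph.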
Adaptive Model Tree for Streaming Data
2013 IEEE 13th International Conference on Data Mining Pub Date : 2013-12-01 DOI: 10.1109/ICDM.2013.46
A. M. Zimmer, Michael Kurze, T. Seidl
Abstract: With an ever-growing availability of data streams the interest in and need for efficient techniques dealing with such data increases. A major challenge in this context is the accurate online prediction of continuous values in the presence of concept drift. In this paper, we introduce a new adaptive model tree (AMT), designed to incrementally learn from the data stream, adapt to the changes, and perform accurate real-time predictions at any time. To deal with submodels lying in different subspaces, we propose a new model clustering algorithm able to identify subspace models, and use it for computing splits in the input space. Compared to the state of the art, our AMT allows for oblique splits, delivering more compact and accurate models.
Citations: 3