2011 IEEE 11th International Conference on Data Mining: Latest Papers (Page 8)

Characterizing Inverse Time Dependency in Multi-class Learning

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.32

Danqi Chen, Weizhu Chen, Qiang Yang

Citations: 2

Nonnegative Matrix Tri-factorization Based High-Order Co-clustering and Its Fast Implementation

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.109

Hua Wang, F. Nie, Heng Huang, C. Ding

Abstract: The fast growth of the Internet and modern technologies has produced data involving objects of multiple types that are related to each other, called multi-type relational data. Traditional clustering methods for single-type data rarely work well on such data, which calls for new clustering techniques, called high-order co-clustering (HOCC), that handle multiple types of data at the same time. A major challenge in developing HOCC methods is how to make effective use of all the information contained in a multi-type relational data set, including both inter-type and intra-type relationships. Meanwhile, because many real-world data sets are large, clustering methods with computationally efficient solution algorithms are of great practical interest. In this paper, we first present a general HOCC framework, named Orthogonal Nonnegative Matrix Tri-factorization (O-NMTF), for simultaneous clustering of multi-type relational data. The proposed O-NMTF approach employs Nonnegative Matrix Tri-Factorization (NMTF) to simultaneously cluster different types of data using the inter-type relationships, and incorporates intra-type information through manifold regularization, where, different from existing works, we emphasize the importance of the orthogonality of the factor matrices of NMTF. Based on O-NMTF, we further develop a novel Fast Nonnegative Matrix Tri-Factorization (F-NMTF) approach to deal with large-scale data. Instead of constraining the factor matrices of NMTF to be nonnegative as in existing methods, F-NMTF constrains them to be cluster indicator matrices, a special type of nonnegative matrix. As a result, the optimization problem of the proposed method can be decoupled into subproblems of much smaller size that require far fewer matrix multiplications, so that our new algorithm scales well to large real-world data. Extensive experimental evaluations have demonstrated the effectiveness of our new approaches.
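The decoupling that the abstract attributes to cluster indicator matrices can be illustrated with a small sketch. It assumes a two-type model X ≈ F S Gᵀ in which every row of F and G contains a single 1, so the indicators are stored as plain label lists. The function names `block_means` and `assign_rows` are hypothetical, and the updates below are a generic block co-clustering step, not the authors' F-NMTF algorithm.

```python
def block_means(X, row_labels, col_labels, k, l):
    """S[c][d] = mean of X over rows in cluster c and columns in cluster d."""
    sums = [[0.0] * l for _ in range(k)]
    counts = [[0] * l for _ in range(k)]
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            sums[row_labels[i]][col_labels[j]] += value
            counts[row_labels[i]][col_labels[j]] += 1
    return [
        [sums[c][d] / counts[c][d] if counts[c][d] else 0.0 for d in range(l)]
        for c in range(k)
    ]


def assign_rows(X, S, col_labels):
    """With S and the column indicators fixed, each row is assigned on its own.

    Row i of F S G^T equals S[c][col_labels[j]] for the chosen cluster c,
    so the squared error splits into one small problem per row.
    """
    labels = []
    for row in X:
        errors = [
            sum((value - S_c[col_labels[j]]) ** 2 for j, value in enumerate(row))
            for S_c in S
        ]
        labels.append(errors.index(min(errors)))
    return labels


# Toy data: two row groups and two column groups (illustrative values only).
X = [[5, 5, 0, 0], [4, 6, 0, 0], [0, 0, 3, 3]]
col_labels = [0, 0, 1, 1]
S = block_means(X, [0, 1, 1], col_labels, k=2, l=2)  # deliberately poor start
print(assign_rows(X, S, col_labels))
```

Because each row's assignment depends only on that row, S, and the column labels, no large matrix products are needed for this step, which is the kind of saving the abstract describes.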

Citations: 47

Helix: Unsupervised Grammar Induction for Structured Activity Recognition

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.74

Huan-Kai Peng, Pang Wu, Jiang Zhu, J. Zhang

Citations: 18

Mining Heavy Subgraphs in Time-Evolving Networks

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.101

Petko Bogdanov, M. Mongiovì, Ambuj K. Singh

Abstract: Networks from different genres are not static entities, but exhibit dynamic behavior. The congestion level of links in transportation networks varies in time depending on the traffic. Similarly, social and communication links are employed at varying rates as information cascades unfold. In recent years there has been an increase of interest in modeling and mining dynamic networks. However, limited attention has been paid to high-scoring subgraph discovery in time-evolving networks. We define the problem of finding the highest-scoring temporal subgraph in a dynamic network, termed Heaviest Dynamic Subgraph (HDS). We show that HDS is NP-hard even with edge weights in {-1,1} and devise an efficient approach for large graph instances that evolve over long time periods. While a naive approach would enumerate all O(t^2) sub-intervals, our solution performs an effective pruning of the sub-interval space by considering O(t*log(t)) groups of sub-intervals and computing an aggregate of each group in logarithmic time. We also define a fast heuristic and a tight upper bound for approximating the static version of HDS, and use them for further pruning the sub-interval space and quickly verifying candidate sub-intervals. We perform an extensive experimental evaluation of our algorithm on transportation, communication and social media networks for discovering subgraphs that correspond to traffic congestion, communication overflow and localized social discussions. Our method is two orders of magnitude faster than a naive approach and scales well with network size and time length.
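The naive O(t^2) enumeration mentioned in the abstract can be sketched as follows. The sketch assumes that the score of a subgraph over an interval is the sum of its edge weights aggregated over that interval. Since the exact static problem is NP-hard, each interval is scored here with a crude stand-in: the sum of positive aggregated edge weights, which ignores connectivity and is therefore only an upper bound under that assumption. It is not the paper's bound, heuristic, or pruning scheme, and all names are hypothetical.

```python
from itertools import accumulate


def prefix_sums(weights):
    """Map each edge to prefix sums, so prefix[e][t] covers time steps 0..t-1."""
    return {edge: [0] + list(accumulate(w)) for edge, w in weights.items()}


def positive_weight_bound(prefix, start, end):
    """Sum of positive aggregated edge weights over the interval [start, end)."""
    total = 0
    for sums in prefix.values():
        aggregated = sums[end] - sums[start]
        if aggregated > 0:
            total += aggregated
    return total


def best_interval_naive(weights, t):
    """Enumerate all O(t^2) sub-intervals and keep the first best-scoring one."""
    prefix = prefix_sums(weights)
    best_score, best_span = 0, None
    for start in range(t):
        for end in range(start + 1, t + 1):
            score = positive_weight_bound(prefix, start, end)
            if score > best_score:
                best_score, best_span = score, (start, end)
    return best_score, best_span


# Two edges with weights in {-1, 1} over four time steps (illustrative values).
weights = {
    ("a", "b"): [-1, 1, 1, -1],
    ("b", "c"): [-1, 1, 1, 1],
}
print(best_interval_naive(weights, t=4))  # (4, (1, 3))
```

Prefix sums make each interval aggregate an O(1) lookup per edge, but the double loop over start and end points remains quadratic in t, which is the cost the abstract's grouping of sub-intervals is meant to avoid.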

Citations: 118

Tensor Fold-in Algorithms for Social Tagging Prediction

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.142

Miao Zhang, C. Ding, Zhifang Liao

Citations: 7

LPTA: A Probabilistic Model for Latent Periodic Topic Analysis

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.96

Zhijun Yin, Liangliang Cao, Jiawei Han, ChengXiang Zhai, Thomas S. Huang

Citations: 28

Discovering the Intrinsic Cardinality and Dimensionality of Time Series Using MDL

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1007/978-3-642-44958-1_14

Bing Hu, T. Rakthanmanon, Yuan Hao, Scott Evans, S. Lonardi, Eamonn J. Keogh

Citations: 61

Multi-task Learning for Bayesian Matrix Factorization

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.107

Chao Yuan

Citations: 9

Clustering with Attribute-Level Constraints

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.36

J. Schmidt, Elisabeth Maria Brändle, Stefan Kramer

Citations: 18

LinkBoost: A Novel Cost-Sensitive Boosting Framework for Community-Level Network Link Prediction

2011 IEEE 11th International Conference on Data Mining Pub Date : 2011-12-11 DOI: 10.1109/ICDM.2011.93

Prakash Mandayam Comar, P. Tan, Anil K. Jain

Abstract: Link prediction is a challenging task due to the inherent skewness of network data. Typical link prediction methods can be categorized as either local or global. Local methods consider the link structure in the immediate neighborhood of a node pair to determine the presence or absence of a link, whereas global methods utilize information from the whole network. This paper presents a community (cluster) level link prediction method without the need to explicitly identify the communities in a network. Specifically, a variable-cost loss function is defined to address the data skewness problem. We provide a theoretical proof that shows the equivalence between maximizing the well-known modularity measure used in community detection and minimizing a special case of the proposed loss function. As a result, any link prediction method designed to optimize the loss function would result in more links being predicted within a community than between communities. We design a boosting algorithm to minimize the loss function and present an approach to scale up the algorithm by decomposing the network into smaller partitions and aggregating the weak learners constructed from each partition. Experimental results show that our proposed LinkBoost algorithm consistently performs as well as or better than many existing methods when evaluated on four real-world network datasets.
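For readers unfamiliar with the modularity measure that the abstract refers to, the following minimal sketch computes the standard Newman modularity of a given partition of an undirected, unweighted graph with no duplicate edges, using Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its nodes, and m is the number of edges. It illustrates the measure only. It does not implement LinkBoost or its loss function, and the function name and toy graph are assumptions made for the example.

```python
def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Fraction of edges that fall inside a community.
    intra = sum(1 for u, v in edges if community[u] == community[v])

    # Total degree per community, used for the expected intra-edge fraction.
    degree_by_community = {}
    for node, d in degree.items():
        label = community[node]
        degree_by_community[label] = degree_by_community.get(label, 0) + d

    expected = sum((d / (2 * m)) ** 2 for d in degree_by_community.values())
    return intra / m - expected


# Two triangles joined by a single bridge edge (illustrative graph).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(edges, two_groups))  # 5/14, roughly 0.357
```

A partition that places many edges inside communities and few between them scores higher, which matches the abstract's remark that optimizing the related loss favors predicting links within a community.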

Citations: 13