{"title":"Characterizing Inverse Time Dependency in Multi-class Learning","authors":"Danqi Chen, Weizhu Chen, Qiang Yang","doi":"10.1109/ICDM.2011.32","DOIUrl":"https://doi.org/10.1109/ICDM.2011.32","url":null,"abstract":"The training time of most learning algorithms increases as the size of training data increases. Yet, recent advances in linear binary SVM and LR challenge this commonsense by proposing an inverse dependency property, where the training time decreases as the size of training data increases. In this paper, we study the inverse dependency property of multi-class classification problem. We describe a general framework for multi-class classification problem with a single objective to achieve inverse dependency and extend it to three popular multi-class algorithms. We present theoretical results demonstrating its convergence and inverse dependency guarantee. We conduct experiments to empirically verify the inverse dependency of all the three algorithms on large-scale datasets as well as to ensure the accuracy.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131112283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonnegative Matrix Tri-factorization Based High-Order Co-clustering and Its Fast Implementation","authors":"Hua Wang, F. Nie, Heng Huang, C. Ding","doi":"10.1109/ICDM.2011.109","DOIUrl":"https://doi.org/10.1109/ICDM.2011.109","url":null,"abstract":"The fast growth of Internet and modern technologies has brought data involving objects of multiple types that are related to each other, called as Multi-Type Relational data. Traditional clustering methods for single-type data rarely work well on them, which calls for new clustering techniques, called as high-order co-clustering (HOCC), to deal with the multiple types of data at the same time. A major challenge in developing HOCC methods is how to effectively make use of all available information contained in a multi-type relational data set, including both inter-type and intra-type relationships. Meanwhile, because many real world data sets are often of large sizes, clustering methods with computationally efficient solution algorithms are of great practical interest. In this paper, we first present a general HOCC framework, named as Orthogonal Nonnegative Matrix Tri-factorization (O-NMTF), for simultaneous clustering of multi-type relational data. The proposed O-NMTF approach employs Nonnegative Matrix Tri-Factorization (NMTF) to simultaneously cluster different types of data using the inter-type relationships, and incorporate intra-type information through manifold regularization, where, different from existing works, we emphasize the importance of the orthogonal ties of the factor matrices of NMTF. Based on O-NMTF, we further develop a novel Fast Nonnegative Matrix Tri-Factorization (F-NMTF) approach to deal with large-scale data. Instead of constraining the factor matrices of NMTF to be nonnegative as in existing methods, F-NMTF constrains them to be cluster indicator matrices, a special type of nonnegative matrices. As a result, the optimization problem of the proposed method can be decoupled, which results in sub problems of much smaller sizes requiring much less matrix multiplications, such that our new algorithm scales well to real world data of large sizes. Extensive experimental evaluations have demonstrated the effectiveness of our new approaches.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114061627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Helix: Unsupervised Grammar Induction for Structured Activity Recognition","authors":"Huan-Kai Peng, Pang Wu, Jiang Zhu, J. Zhang","doi":"10.1109/ICDM.2011.74","DOIUrl":"https://doi.org/10.1109/ICDM.2011.74","url":null,"abstract":"The omnipresence of mobile sensors has brought tremendous opportunities to ubiquitous computing systems. In many natural settings, however, their broader applications are hindered by three main challenges: rarity of labels, uncertainty of activity granularities, and the difficulty of multi-dimensional sensor fusion. In this paper, we propose building a grammar to address all these challenges using a language-based approach. The proposed algorithm, called Helix, first generates an initial vocabulary using unlabeled sensor readings, followed by iteratively combining statistically collocated sub-activities across sensor dimensions and grouping similar activities together to discover higher level activities. The experiments using a 20-minute ping-pong game demonstrate favorable results compared to a Hierarchical Hidden Markov Model (HHMM) baseline. Closer investigations to the learned grammar also shows that the learned grammar captures the natural structure of the underlying activities.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114090014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Heavy Subgraphs in Time-Evolving Networks","authors":"Petko Bogdanov, M. Mongiovì, Ambuj K. Singh","doi":"10.1109/ICDM.2011.101","DOIUrl":"https://doi.org/10.1109/ICDM.2011.101","url":null,"abstract":"Networks from different genres are not static entities, but exhibit dynamic behavior. The congestion level of links in transportation networks varies in time depending on the traffic. Similarly, social and communication links are employed at varying rates as information cascades unfold. In recent years there has been an increase of interest in modeling and mining dynamic networks. However, limited attention has been placed in high-scoring sub graph discovery in time-evolving networks. We define the problem of finding the highest-scoring temporal sub graph in a dynamic network, termed Heaviest Dynamic Sub graph (HDS). We show that HDS is NP-hard even with edge weights in {-1,1} and devise an efficient approach for large graph instances that evolve over long time periods. While a naive approach would enumerate all O(t^2) sub-intervals, our solution performs an effective pruning of the sub-interval space by considering O(t*log(t)) groups of sub-intervals and computing an aggregate of each group in logarithmic time. We also define a fast heuristic and a tight upper bound for approximating the static version of HDS, and use them for further pruning the sub-interval space and quickly verifying candidate sub-intervals. We perform an extensive experimental evaluation of our algorithm on transportation, communication and social media networks for discovering sub graphs that correspond to traffic congestions, communication overflow and localized social discussions. Our method is two orders of magnitude faster than a naive approach and scales well with network size and time length.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"258 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116203964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tensor Fold-in Algorithms for Social Tagging Prediction","authors":"Miao Zhang, C. Ding, Zhifang Liao","doi":"10.1109/ICDM.2011.142","DOIUrl":"https://doi.org/10.1109/ICDM.2011.142","url":null,"abstract":"Social tagging predictions involve the co occurrence of users, items and tags. The tremendous growth of users require the recommender system to produce tag recommendations for millions of users and items at any minute. The triplets of users, items and tags are most naturally described by a 3D tensor, and tensor decomposition-based algorithms can produce high quality recommendations. However, each day, thousands of new users are added to the system and the decompositions must be updated daily in a online fashion. In this paper, we provide analysis of the new user problem, and present fold-in algorithms for Tucker, Para Fac, and Low-order tensor decompositions. We show that these algorithm can very efficiently compute the needed decompositions. We evaluate the fold-in algorithms experimentally on several datasets and the results demonstrate the effectiveness of these algorithms.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128688503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhijun Yin, Liangliang Cao, Jiawei Han, ChengXiang Zhai, Thomas S. Huang
{"title":"LPTA: A Probabilistic Model for Latent Periodic Topic Analysis","authors":"Zhijun Yin, Liangliang Cao, Jiawei Han, ChengXiang Zhai, Thomas S. Huang","doi":"10.1109/ICDM.2011.96","DOIUrl":"https://doi.org/10.1109/ICDM.2011.96","url":null,"abstract":"This paper studies the problem of latent periodic topic analysis from time stamped documents. The examples of time stamped documents include news articles, sales records, financial reports, TV programs, and more recently, posts from social media websites such as Flickr, Twitter, and Face book. Different from detecting periodic patterns in traditional time series database, we discover the topics of coherent semantics and periodic characteristics where a topic is represented by a distribution of words. We propose a model called LPTA (Latent Periodic Topic Analysis) that exploits the periodicity of the terms as well as term co-occurrences. To show the effectiveness of our model, we collect several representative datasets including Seminar, DBLP and Flickr. The results show that our model can discover the latent periodic topics effectively and leverage the information from both text and time well.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127422783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bing Hu, T. Rakthanmanon, Yuan Hao, Scott Evans, S. Lonardi, Eamonn J. Keogh
{"title":"Discovering the Intrinsic Cardinality and Dimensionality of Time Series Using MDL","authors":"Bing Hu, T. Rakthanmanon, Yuan Hao, Scott Evans, S. Lonardi, Eamonn J. Keogh","doi":"10.1007/978-3-642-44958-1_14","DOIUrl":"https://doi.org/10.1007/978-3-642-44958-1_14","url":null,"abstract":"","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114429667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-task Learning for Bayesian Matrix Factorization","authors":"Chao Yuan","doi":"10.1109/ICDM.2011.107","DOIUrl":"https://doi.org/10.1109/ICDM.2011.107","url":null,"abstract":"Data sparsity is a big challenge for collaborative filtering. This problem becomes more serious if the dataset is newly created and has even fewer ratings. By sharing knowledge among different datasets, multi-task learning is a promising technique to address this issue. Most prior work methods directly share objects (users or items) across different datasets. However, object identities and correspondences may not be known in many cases. We extend the previous work of Bayesian matrix factorization with Dirichlet process mixture into a multi-task learning approach by sharing latent parameters among different tasks. Our method does not require object identities and thus is more widely applicable. The proposed model is fully non-parametric in that the dimension of latent feature vectors is automatically determined. Inference is performed using the variational Bayesian algorithm, which is much faster than Gibbs sampling used by most other related Bayesian methods.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114233728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Schmidt, Elisabeth Maria Brändle, Stefan Kramer
{"title":"Clustering with Attribute-Level Constraints","authors":"J. Schmidt, Elisabeth Maria Brändle, Stefan Kramer","doi":"10.1109/ICDM.2011.36","DOIUrl":"https://doi.org/10.1109/ICDM.2011.36","url":null,"abstract":"In many clustering applications the incorporation of background knowledge in the form of constraints is desirable. In this paper, we introduce a new constraint type and the corresponding clustering problem: attribute constrained clustering. The goal is to induce clusters of binary instances that satisfy constraints on the attribute level. These constraints specify whether instances may or may not be grouped to a cluster, depending on specific attribute values. We show how the well-established instance-level constraints, must-link and cannot-link, can be adapted to the attribute level. A variant of the k-Medoids algorithm taking into account attribute level constraints is evaluated on synthetic and real-world data. Experimental results show that such constraints may provide better clustering results at lower specification costs if constraints can be expressed on the attribute level.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128984989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LinkBoost: A Novel Cost-Sensitive Boosting Framework for Community-Level Network Link Prediction","authors":"Prakash Mandayam Comar, P. Tan, Anil K. Jain","doi":"10.1109/ICDM.2011.93","DOIUrl":"https://doi.org/10.1109/ICDM.2011.93","url":null,"abstract":"Link prediction is a challenging task due to the inherent skew ness of network data. Typical link prediction methods can be categorized as either local or global. Local methods consider the link structure in the immediate neighborhood of a node pair to determine the presence or absence of a link, whereas global methods utilize information from the whole network. This paper presents a community (cluster) level link prediction method without the need to explicitly identify the communities in a network. Specifically, a variable-cost loss function is defined to address the data skew ness problem. We provide theoretical proof that shows the equivalence between maximizing the well-known modularity measure used in community detection and minimizing a special case of the proposed loss function. As a result, any link prediction method designed to optimize the loss function would result in more links being predicted within a community than between communities. We design a boosting algorithm to minimize the loss function and present an approach to scale-up the algorithm by decomposing the network into smaller partitions and aggregating the weak learners constructed from each partition. Experimental results show that our proposed Link Boost algorithm consistently performs as good as or better than many existing methods when evaluated on 4 real-world network datasets.","PeriodicalId":106216,"journal":{"name":"2011 IEEE 11th International Conference on Data Mining","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115803593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}