Xianchao Zhang, Yansheng Jiang, Wenxin Liang, Xin Han
{"title":"Graph-Based Semi-supervised Learning with Adaptive Similarity Estimation","authors":"Xianchao Zhang, Yansheng Jiang, Wenxin Liang, Xin Han","doi":"10.1109/ICDM.2010.30","DOIUrl":"https://doi.org/10.1109/ICDM.2010.30","url":null,"abstract":"Graph-based semi-supervised learning algorithms have attracted a lot of attention. Constructing a good graph plays an essential role in all of these algorithms. Many existing graph construction methods (e.g., the Gaussian kernel) require a user-supplied parameter, which is hard to configure manually. In this paper, we propose a parameter-free similarity measure, Adaptive Similarity Estimation (ASE), which constructs the graph by adaptively optimizing a linear combination of its neighbors. Experimental results show the effectiveness of our proposed method.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116804364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Kitts, Liang Wei, Dyng Au, Amanda Powter, Brian Burdick
{"title":"Attribution of Conversion Events to Multi-channel Media","authors":"B. Kitts, Liang Wei, Dyng Au, Amanda Powter, Brian Burdick","doi":"10.1109/ICDM.2010.161","DOIUrl":"https://doi.org/10.1109/ICDM.2010.161","url":null,"abstract":"This paper presents a practical method for measuring the impact of multiple marketing events on sales, including marketing events that are not traditionally trackable. The technique infers which of several competing media events are likely to have caused a given conversion. We test the method using hold-out sets, and also a live media experiment in which we test whether the method can accurately predict television-generated web conversions.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"51 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123703086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Information Diffusion in Implicit Networks","authors":"Jaewon Yang, J. Leskovec","doi":"10.1109/ICDM.2010.22","DOIUrl":"https://doi.org/10.1109/ICDM.2010.22","url":null,"abstract":"Social media forms a central domain for the production and dissemination of real-time information. Even though such flows of information have traditionally been thought of as diffusion processes over social networks, the underlying phenomena are the result of a complex web of interactions among numerous participants. Here we develop the Linear Influence Model where rather than requiring the knowledge of the social network and then modeling the diffusion by predicting which node will influence which other nodes in the network, we focus on modeling the global influence of a node on the rate of diffusion through the (implicit) network. We model the number of newly infected nodes as a function of which other nodes got infected in the past. For each node we estimate an influence function that quantifies how many subsequent infections can be attributed to the influence of that node over time. A nonparametric formulation of the model leads to a simple least squares problem that can be solved on large datasets. We validate our model on a set of 500 million tweets and a set of 170 million news articles and blog posts. We show that the Linear Influence Model accurately models influences of nodes and reliably predicts the temporal dynamics of information diffusion. We find that patterns of influence of individual participants differ significantly depending on the type of the node and the topic of the information.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"260 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116234622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huisheng Zhu, Peng Wang, Xianmang He, Yujiao Li, Wei Wang, Baile Shi
{"title":"Efficient Episode Mining with Minimal and Non-overlapping Occurrences","authors":"Huisheng Zhu, Peng Wang, Xianmang He, Yujiao Li, Wei Wang, Baile Shi","doi":"10.1109/ICDM.2010.25","DOIUrl":"https://doi.org/10.1109/ICDM.2010.25","url":null,"abstract":"Frequent serial episodes within an event sequence describe the behavior of users or systems in an application. Existing mining algorithms calculate the frequency of an episode based on overlapping or non-minimal occurrences, which is prone to over-counting the support of long episodes or poorly characterizing the followed-by-closely relationship over event types. In addition, because they use an Apriori-style level-wise approach, these algorithms are computationally expensive. In this paper, we propose an efficient algorithm, MANEPI (Minimal And Non-overlapping EPIsode), for mining more interesting frequent episodes within a given event sequence. The proposed frequency measure takes both minimal and non-overlapping occurrences of an episode into consideration and ensures better mining quality. The depth-first search strategy, which uses the Apriori property for episode growth, greatly improves the efficiency of mining long episodes because it scans the given sequence only once and does not generate candidate episodes. Moreover, an optimization technique is presented to narrow down the search space and speed up the mining process. Experimental evaluation on both synthetic and real-world datasets demonstrates that our algorithms are more efficient and effective.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124243780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning a Bi-Stochastic Data Similarity Matrix","authors":"Fei Wang, Ping Li, A. König","doi":"10.1109/ICDM.2010.141","DOIUrl":"https://doi.org/10.1109/ICDM.2010.141","url":null,"abstract":"An idealized clustering algorithm seeks to learn a cluster-adjacency matrix such that, if two data points belong to the same cluster, the corresponding entry would be 1, otherwise the entry would be 0. This integer (1/0) constraint makes it difficult to find the optimal solution. We propose a relaxation on the cluster-adjacency matrix, by deriving a bi-stochastic matrix from a data similarity (e.g., kernel) matrix according to the Bregman divergence. Our general method is named the Bregmanian Bi-Stochastication (BBS) algorithm. We focus on two popular choices of the Bregman divergence: the Euclidean distance and the KL divergence. Interestingly, the BBS algorithm using the KL divergence is equivalent to the Sinkhorn-Knopp (SK) algorithm for deriving bi-stochastic matrices. We show that the BBS algorithm using the Euclidean distance is closely related to the relaxed $k$-means clustering and can often produce noticeably better clustering results than the SK algorithm (and other algorithms such as Normalized Cut), through extensive experiments on public data sets.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125846405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Maximum Margin Clustering","authors":"Bo Dai, Bao-Gang Hu, Gang Niu","doi":"10.1109/ICDM.2010.117","DOIUrl":"https://doi.org/10.1109/ICDM.2010.117","url":null,"abstract":"Most well-known discriminative clustering models, such as spectral clustering (SC) and maximum margin clustering (MMC), are non-Bayesian. Moreover, they only consider embedding domain-dependent prior knowledge into data-specific kernels, while other forms of prior knowledge are seldom considered in these models. In this paper, we propose a Bayesian maximum margin clustering model (BMMC) based on the low-density separation assumption, which unifies the merits of both Bayesian and discriminative approaches. In addition to stating a prior distribution on functions explicitly, as in traditional Gaussian processes, special prior knowledge can easily be embedded into BMMC implicitly via the Universum set. Furthermore, it is much easier to solve a BMMC than an MMC since the integer variables in the optimization are eliminated. Experimental results show that the BMMC achieves comparable or even better performance than state-of-the-art clustering methods and that solving BMMC is more efficient.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"65 2 Pt 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131248443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Financial Forecasting with Gompertz Multiple Kernel Learning","authors":"Han Qin, Dejing Dou, Yue Fang","doi":"10.1109/ICDM.2010.68","DOIUrl":"https://doi.org/10.1109/ICDM.2010.68","url":null,"abstract":"Financial forecasting is the basis for budgeting activities and estimating future financing needs. Applying machine learning and data mining models to financial forecasting is both effective and efficient. Among different kinds of machine learning models, kernel methods are well accepted since they are more robust and accurate than traditional models, such as neural networks. However, learning from multiple data sources is still one of the main challenges in the financial forecasting area. In this paper, we focus on applying multiple kernel learning models to multiple major international stock indexes. Our experimental results indicate that applying multiple kernel learning to the financial forecasting problem suffers from both the short training period problem and the non-stationarity problem. Therefore we propose a novel multiple kernel learning model to address these challenges by introducing the Gompertz model and considering a non-linear combination of different kernel matrices. The experimental results show that our Gompertz multiple kernel learning model addresses the challenges and achieves better performance than the original multiple kernel learning model and single SVM models.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133673032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Location and Scatter Matching for Dataset Shift in Text Mining","authors":"Bo Chen, Wai Lam, I. Tsang, Tak-Lam Wong","doi":"10.1109/ICDM.2010.72","DOIUrl":"https://doi.org/10.1109/ICDM.2010.72","url":null,"abstract":"Dataset shift from the training data in a source domain to the data in a target domain poses a great challenge for many statistical learning methods. Most algorithms can be viewed as exploiting only the first-order statistics, namely, the empirical mean discrepancy to evaluate the distribution gap. Intuitively, considering only the empirical mean may not be statistically efficient. In this paper, we propose a non-parametric distance metric with a good property which jointly considers the empirical mean (Location) and sample covariance (Scatter) difference. More specifically, we propose an improved symmetric Stein's loss function which combines the mean and covariance discrepancy into a unified Bregman matrix divergence of which Jensen-Shannon divergence between normal distributions is a particular case. Our goal is to find a good feature representation that reduces the distribution gap between different domains while ensuring that the newly derived representation encodes most discriminative components with respect to the label information. We have conducted extensive experiments on several document classification datasets to demonstrate the effectiveness of our proposed method.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123185462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining Sensor Streams for Discovering Human Activity Patterns over Time","authors":"Parisa Rashidi, D. Cook","doi":"10.1109/ICDM.2010.40","DOIUrl":"https://doi.org/10.1109/ICDM.2010.40","url":null,"abstract":"In recent years, newly emerging application domains have introduced new constraints and methods in the data mining field. One such application domain is activity discovery from sensor data. Activity discovery and recognition play an important role in a wide range of applications from assisted living to security and surveillance. Most of the current approaches for activity discovery assume a static model of the activities and ignore the problem of mining and discovering activities from a data stream over time. Inspired by the unique requirements of the activity discovery application domain, in this paper we propose a new stream mining method for finding sequential patterns over time from streaming non-transaction data using multiple time granularities. Our algorithm is able to find sequential patterns, even if the patterns exhibit discontinuities (interruptions) or variations in the sequence order. Our algorithm also addresses the problem of dealing with rare events across space and over time. We validate the results of our algorithms using data collected from two different smart apartments.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"33 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124109447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge Weight Regularization over Multiple Graphs for Similarity Learning","authors":"Pradeep Muthukrishnan, Dragomir R. Radev, Q. Mei","doi":"10.1109/ICDM.2010.156","DOIUrl":"https://doi.org/10.1109/ICDM.2010.156","url":null,"abstract":"The growth of the web has directly influenced the increase in the availability of relational data. One of the key problems in mining such data is computing the similarity between objects with heterogeneous feature types. For example, publications have many heterogeneous features like text, citations, authorship information, venue information, etc. In most approaches, similarity is estimated using each feature type in isolation and then combined in a linear fashion. However, this approach does not take advantage of the dependencies between the different feature spaces. In this paper, we propose a novel approach to combine the different sources of similarity using a regularization framework over edges in multiple graphs. We show that the objective function induced by the framework is convex. We also propose an efficient algorithm using coordinate descent [1] to solve the optimization problem. We extrinsically evaluate the performance of the proposed unified similarity measure on two different tasks, clustering and classification. The proposed similarity measure outperforms three baselines and a state-of-the-art classification algorithm on a variety of standard, large data sets.","PeriodicalId":294061,"journal":{"name":"2010 IEEE International Conference on Data Mining","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116863608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}