Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining最新文献_第4页

Indexing and mining time sequences 索引和挖掘时间序列

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1866295

Lei Li, C. Faloutsos

引用次数: 0

Session details: Research track 20: evolving and spatial data 研究专场20:演变和空间数据

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/3248800

Hui Xiong

引用次数: 0

Data winnowing 数据筛选

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835806

Y. Freund

引用次数: 0

A scalable two-stage approach for a class of dimensionality reduction techniques 一类降维技术的可扩展两阶段方法

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835846

Liang Sun, Betul Ceran, Jieping Ye

{"title":"A scalable two-stage approach for a class of dimensionality reduction techniques","authors":"Liang Sun, Betul Ceran, Jieping Ye","doi":"10.1145/1835804.1835846","DOIUrl":"https://doi.org/10.1145/1835804.1835846","url":null,"abstract":"Dimensionality reduction plays an important role in many data mining applications involving high-dimensional data. Many existing dimensionality reduction techniques can be formulated as a generalized eigenvalue problem, which does not scale to large-size problems. Prior work transforms the generalized eigenvalue problem into an equivalent least squares formulation, which can then be solved efficiently. However, the equivalence relationship only holds under certain assumptions without regularization, which severely limits their applicability in practice. In this paper, an efficient two-stage approach is proposed to solve a class of dimensionality reduction techniques, including Canonical Correlation Analysis, Orthonormal Partial Least Squares, linear Discriminant Analysis, and Hypergraph Spectral Learning. The proposed two-stage approach scales linearly in terms of both the sample size and data dimensionality. The main contributions of this paper include (1) we rigorously establish the equivalence relationship between the proposed two-stage approach and the original formulation without any assumption; and (2) we show that the equivalence relationship still holds in the regularization setting. We have conducted extensive experiments using both synthetic and real-world data sets. Our experimental results confirm the equivalence relationship established in this paper. Results also demonstrate the scalability of the proposed two-stage approach.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82384231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 42

Negative correlations in collaboration: concepts and algorithms 协作中的负相关:概念和算法

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835864

Jinyan Li, Qian Liu, Tao Zeng

引用次数: 7

Transfer metric learning by learning task relationships 通过学习任务关系迁移度量学习

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835954

Yu Zhang, D. Yeung

{"title":"Transfer metric learning by learning task relationships","authors":"Yu Zhang, D. Yeung","doi":"10.1145/1835804.1835954","DOIUrl":"https://doi.org/10.1145/1835804.1835954","url":null,"abstract":"Distance metric learning plays a very crucial role in many data mining algorithms because the performance of an algorithm relies heavily on choosing a good metric. However, the labeled data available in many applications is scarce and hence the metrics learned are often unsatisfactory. In this paper, we consider a transfer learning setting in which some related source tasks with labeled data are available to help the learning of the target task. We first propose a convex formulation for multi-task metric learning by modeling the task relationships in the form of a task covariance matrix. Then we regard transfer learning as a special case of multi-task learning and adapt the formulation of multi-task metric learning to the transfer learning setting for our method, called transfer metric learning (TML). In TML, we learn the metric and the task covariances between the source tasks and the target task under a unified convex formulation. To solve the convex optimization problem, we use an alternating method in which each subproblem has an efficient solution. Experimental results on some commonly used transfer learning applications demonstrate the effectiveness of our method.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89379051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 102

Finding effectors in social networks 在社交网络中寻找效应

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835937

Theodoros Lappas, Evimaria Terzi, D. Gunopulos, H. Mannila

引用次数: 216

Document clustering via dirichlet process mixture model with feature selection 基于特征选择的dirichlet过程混合模型的文档聚类

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835901

Guan Yu, Rui-zhang Huang, Zhaojun Wang

引用次数: 61

Cold start link prediction 冷启动链路预测

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835855

V. Leroy, B. B. Cambazoglu, F. Bonchi

引用次数: 171

Grafting-light: fast, incremental feature selection and structure learning of Markov random fields 嫁接轻:快速，增量特征选择和马尔可夫随机场的结构学习

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining Pub Date : 2010-07-25 DOI: 10.1145/1835804.1835845

Jun Zhu, N. Lao, E. Xing

{"title":"Grafting-light: fast, incremental feature selection and structure learning of Markov random fields","authors":"Jun Zhu, N. Lao, E. Xing","doi":"10.1145/1835804.1835845","DOIUrl":"https://doi.org/10.1145/1835804.1835845","url":null,"abstract":"Feature selection is an important task in order to achieve better generalizability in high dimensional learning, and structure learning of Markov random fields (MRFs) can automatically discover the inherent structures underlying complex data. Both problems can be cast as solving an l1-norm regularized parameter estimation problem. The existing Grafting method can avoid doing inference on dense graphs in structure learning by incrementally selecting new features. However, Grafting performs a greedy step to optimize over free parameters once new features are included. This greedy strategy results in low efficiency when parameter learning is itself non-trivial, such as in MRFs, in which parameter learning depends on an expensive subroutine to calculate gradients. The complexity of calculating gradients in MRFs is typically exponential to the size of maximal cliques. In this paper, we present a fast algorithm called Grafting-Light to solve the l1-norm regularized maximum likelihood estimation of MRFs for efficient feature selection and structure learning. Grafting-Light iteratively performs one-step of orthant-wise gradient descent over free parameters and selects new features. This lazy strategy is guaranteed to converge to the global optimum and can effectively select significant features. On both synthetic and real data sets, we show that Grafting-Light is much more efficient than Grafting for both feature selection and structure learning, and performs comparably with the optimal batch method that directly optimizes over all the features for feature selection but is much more efficient and accurate for structure learning of MRFs.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88594401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 40