Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining最新文献

筛选
英文 中文
Grafting-light: fast, incremental feature selection and structure learning of Markov random fields 嫁接轻:快速,增量特征选择和马尔可夫随机场的结构学习
Jun Zhu, N. Lao, E. Xing
{"title":"Grafting-light: fast, incremental feature selection and structure learning of Markov random fields","authors":"Jun Zhu, N. Lao, E. Xing","doi":"10.1145/1835804.1835845","DOIUrl":"https://doi.org/10.1145/1835804.1835845","url":null,"abstract":"Feature selection is an important task in order to achieve better generalizability in high dimensional learning, and structure learning of Markov random fields (MRFs) can automatically discover the inherent structures underlying complex data. Both problems can be cast as solving an l1-norm regularized parameter estimation problem. The existing Grafting method can avoid doing inference on dense graphs in structure learning by incrementally selecting new features. However, Grafting performs a greedy step to optimize over free parameters once new features are included. This greedy strategy results in low efficiency when parameter learning is itself non-trivial, such as in MRFs, in which parameter learning depends on an expensive subroutine to calculate gradients. The complexity of calculating gradients in MRFs is typically exponential to the size of maximal cliques. In this paper, we present a fast algorithm called Grafting-Light to solve the l1-norm regularized maximum likelihood estimation of MRFs for efficient feature selection and structure learning. Grafting-Light iteratively performs one-step of orthant-wise gradient descent over free parameters and selects new features. This lazy strategy is guaranteed to converge to the global optimum and can effectively select significant features. On both synthetic and real data sets, we show that Grafting-Light is much more efficient than Grafting for both feature selection and structure learning, and performs comparably with the optimal batch method that directly optimizes over all the features for feature selection but is much more efficient and accurate for structure learning of MRFs.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88594401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 40
Session details: Research track 5: classification models and tools 专题研究5:分类模型和工具
Gregory Piatetsky
{"title":"Session details: Research track 5: classification models and tools","authors":"Gregory Piatetsky","doi":"10.1145/3248785","DOIUrl":"https://doi.org/10.1145/3248785","url":null,"abstract":"","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72832232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An energy-efficient mobile recommender system 高效节能的移动推荐系统
Yong Ge, Hui Xiong, A. Tuzhilin, Keli Xiao, M. Gruteser, M. Pazzani
{"title":"An energy-efficient mobile recommender system","authors":"Yong Ge, Hui Xiong, A. Tuzhilin, Keli Xiao, M. Gruteser, M. Pazzani","doi":"10.1145/1835804.1835918","DOIUrl":"https://doi.org/10.1145/1835804.1835918","url":null,"abstract":"The increasing availability of large-scale location traces creates unprecedent opportunities to change the paradigm for knowledge discovery in transportation systems. A particularly promising area is to extract energy-efficient transportation patterns (green knowledge), which can be used as guidance for reducing inefficiencies in energy consumption of transportation sectors. However, extracting green knowledge from location traces is not a trivial task. Conventional data analysis tools are usually not customized for handling the massive quantity, complex, dynamic, and distributed nature of location traces. To that end, in this paper, we provide a focused study of extracting energy-efficient transportation patterns from location traces. Specifically, we have the initial focus on a sequence of mobile recommendations. As a case study, we develop a mobile recommender system which has the ability in recommending a sequence of pick-up points for taxi drivers or a sequence of potential parking positions. The goal of this mobile recommendation system is to maximize the probability of business success. Along this line, we provide a Potential Travel Distance (PTD) function for evaluating each candidate sequence. This PTD function possesses a monotone property which can be used to effectively prune the search space. Based on this PTD function, we develop two algorithms, LCP and SkyRoute, for finding the recommended routes. Finally, experimental results show that the proposed system can provide effective mobile sequential recommendation and the knowledge extracted from location traces can be used for coaching drivers and leading to the efficient use of energy.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77045857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 369
Combined regression and ranking 组合回归与排序
D. Sculley
{"title":"Combined regression and ranking","authors":"D. Sculley","doi":"10.1145/1835804.1835928","DOIUrl":"https://doi.org/10.1145/1835804.1835928","url":null,"abstract":"Many real-world data mining tasks require the achievement of two distinct goals when applied to unseen data: first, to induce an accurate preference ranking, and second to give good regression performance. In this paper, we give an efficient and effective Combined Regression and Ranking method (CRR) that optimizes regression and ranking objectives simultaneously. We demonstrate the effectiveness of CRR for both families of metrics on a range of large-scale tasks, including click prediction for online advertisements. Results show that CRR often achieves performance equivalent to the best of both ranking-only and regression-only approaches. In the case of rare events or skewed distributions, we also find that this combination can actually improve regression performance due to the addition of informative ranking constraints.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77264851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 155
Dynamics of conversations 对话的动态
Ravi Kumar, Mohammad Mahdian, Mary McGlohon
{"title":"Dynamics of conversations","authors":"Ravi Kumar, Mohammad Mahdian, Mary McGlohon","doi":"10.1145/1835804.1835875","DOIUrl":"https://doi.org/10.1145/1835804.1835875","url":null,"abstract":"How do online conversations build? Is there a common model that human communication follows? In this work we explore these questions in detail. We analyze the structure of conversations in three different social datasets, namely, Usenet groups, Yahoo! Groups, and Twitter. We propose a simple mathematical model for the generation of basic conversation structures and then refine this model to take into account the identities of each member of the conversation.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87226459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 130
Session details: Research track 10: graph mining and classification 会议细节:研究专题10:图挖掘和分类
J. Pei
{"title":"Session details: Research track 10: graph mining and classification","authors":"J. Pei","doi":"10.1145/3248790","DOIUrl":"https://doi.org/10.1145/3248790","url":null,"abstract":"","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"78 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83046338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Feature selection for support vector regression using probabilistic prediction 基于概率预测的支持向量回归特征选择
Jian-Bo Yang, C. Ong
{"title":"Feature selection for support vector regression using probabilistic prediction","authors":"Jian-Bo Yang, C. Ong","doi":"10.1145/1835804.1835849","DOIUrl":"https://doi.org/10.1145/1835804.1835849","url":null,"abstract":"This paper presents a novel wrapper-based feature selection method for Support Vector Regression (SVR) using its probabilistic predictions. The method computes the importance of a feature by aggregating the difference, over the feature space, of the conditional density functions of the SVR prediction with and without the feature. As the exact computation of this importance measure is expensive, two approximations are proposed. The effectiveness of the measure using these approximations, in comparison to several other existing feature selection methods for SVR, is evaluated on both artificial and real-world problems. The result of the experiment shows that the proposed method generally performs better, and at least as well as the existing methods, with notable advantage when the data set is sparse.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82067594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 12
Privacy-preserving outsourcing support vector machines with random transformation 具有随机变换的隐私保护外包支持向量机
Keng-Pei Lin, Ming-Syan Chen
{"title":"Privacy-preserving outsourcing support vector machines with random transformation","authors":"Keng-Pei Lin, Ming-Syan Chen","doi":"10.1145/1835804.1835852","DOIUrl":"https://doi.org/10.1145/1835804.1835852","url":null,"abstract":"Outsourcing the training of support vector machines (SVM) to external service providers benefits the data owner who is not familiar with the techniques of the SVM or has limited computing resources. In outsourcing, the data privacy is a critical issue for some legal or commercial reasons since there may be sensitive information contained in the data. Existing privacy-preserving SVM works are either not applicable to outsourcing or weak in security. In this paper, we propose a scheme for privacy-preserving outsourcing the training of the SVM without disclosing the actual content of the data to the service provider. In the proposed scheme, the data sent to the service provider is perturbed by a random transformation, and the service provider trains the SVM for the data owner from the perturbed data. The proposed scheme is stronger in security than existing techniques, and incurs very little redundant communication and computation cost.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82110844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 47
Evaluating online ad campaigns in a pipeline: causal models at scale 评估在线广告活动:大规模的因果模型
David Chan, Rong Ge, Ori Gershony, Tim Hesterberg, D. Lambert
{"title":"Evaluating online ad campaigns in a pipeline: causal models at scale","authors":"David Chan, Rong Ge, Ori Gershony, Tim Hesterberg, D. Lambert","doi":"10.1145/1835804.1835809","DOIUrl":"https://doi.org/10.1145/1835804.1835809","url":null,"abstract":"Display ads proliferate on the web, but are they effective? Or are they irrelevant in light of all the other advertising that people see? We describe a way to answer these questions, quickly and accurately, without randomized experiments, surveys, focus groups or expert data analysts. Doubly robust estimation protects against the selection bias that is inherent in observational data, and a nonparametric test that is based on irrelevant outcomes provides further defense. Simulations based on realistic scenarios show that the resulting estimates are more robust to selection bias than traditional alternatives, such as regression modeling or propensity scoring. Moreover, computations are fast enough that all processing, from data retrieval through estimation, testing, validation and report generation, proceeds in an automated pipeline, without anyone needing to see the raw data.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81439996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 110
Multi-task learning for boosting with application to web search ranking 多任务学习提高与应用程序的网页搜索排名
O. Chapelle, Pannagadatta K. Shivaswamy, Srinivas Vadrevu, Kilian Q. Weinberger, Ya Zhang, Belle L. Tseng
{"title":"Multi-task learning for boosting with application to web search ranking","authors":"O. Chapelle, Pannagadatta K. Shivaswamy, Srinivas Vadrevu, Kilian Q. Weinberger, Ya Zhang, Belle L. Tseng","doi":"10.1145/1835804.1835953","DOIUrl":"https://doi.org/10.1145/1835804.1835953","url":null,"abstract":"In this paper we propose a novel algorithm for multi-task learning with boosted decision trees. We learn several different learning tasks with a joint model, explicitly addressing the specifics of each learning task with task-specific parameters and the commonalities between them through shared parameters. This enables implicit data sharing and regularization. We evaluate our learning method on web-search ranking data sets from several countries. Here, multitask learning is particularly helpful as data sets from different countries vary largely in size because of the cost of editorial judgments. Our experiments validate that learning various tasks jointly can lead to significant improvements in performance with surprising reliability.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"79 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88265918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 111
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信