{"title":"Accelerated Alternating Direction Method of Multipliers","authors":"Mojtaba Kadkhodaie, Konstantina Christakopoulou, Maziar Sanjabi, A. Banerjee","doi":"10.1145/2783258.2783400","DOIUrl":"https://doi.org/10.1145/2783258.2783400","url":null,"abstract":"Recent years have seen a revival of interest in the Alternating Direction Method of Multipliers (ADMM), due to its simplicity, versatility, and scalability. As a first order method for general convex problems, the rate of convergence of ADMM is O(1=k) [4, 25]. Given the scale of modern data mining problems, an algorithm with similar properties as ADMM but faster convergence rate can make a big difference in real world applications. In this paper, we introduce the Accelerated Alternating Direction Method of Multipliers (A2DM2) which solves problems with the same structure as ADMM. When the objective function is strongly convex, we show that A2DM2 has a O(1=k2) convergence rate. Unlike related existing literature on trying to accelerate ADMM, our analysis does not need any additional restricting assumptions. Through experiments, we show that A2DM2 converges faster than ADMM on a variety of problems. Further, we illustrate the versatility of the general A2DM2 on the problem of learning to rank, where it is shown to be competitive with the state-of-the-art specialized algorithms for the problem on both scalability and accuracy.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129875053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Collective Bayesian Poisson Factorization Model for Cold-start Local Event Recommendation","authors":"Wei Zhang, Jianyong Wang","doi":"10.1145/2783258.2783336","DOIUrl":"https://doi.org/10.1145/2783258.2783336","url":null,"abstract":"Event-based social networks (EBSNs), in which organizers publish events to attract other users in local city to attend offline, emerge in recent years and grow rapidly. Due to the large volume of events in EBSNs, event recommendation is essential. A few recent works focus on this task, while almost all the methods need that each event to be recommended should have been registered by some users to attend. Thus they ignore two essential characteristics of events in EBSNs: (1) a large number of new events will be published every day which means many events have few participants in the beginning, (2) events have life cycles which means outdated events should not be recommended. Overall, event recommendation in EBSNs inevitably faces the cold-start problem. In this work, we address the new problem of cold-start local event recommendation in EBSNs. We propose a collective Bayesian Poisson factorization (CBPF) model for handling this problem. CBPF takes recently proposed Bayesian Poisson factorization as its basic unit to model user response to events, social relation, and content text separately. Then it further jointly connects these units by the idea of standard collective matrix factorization model. Moreover, in our model event textual content, organizer, and location information are utilized to learn representation of cold-start events for predicting user response to them. Besides, an efficient coordinate ascent algorithm is adopted to learn the model. We conducted comprehensive experiments on real datasets crawled from EBSNs and the results demonstrate our proposed model is effective and outperforms several alternative methods.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134504718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Message Update for Fast Affinity Propagation","authors":"Y. Fujiwara, M. Nakatsuji, Hiroaki Shiokawa, Yasutoshi Ida, Machiko Toyoda","doi":"10.1145/2783258.2783280","DOIUrl":"https://doi.org/10.1145/2783258.2783280","url":null,"abstract":"Affinity Propagation is a clustering algorithm used in many applications. It iteratively updates messages between data points until convergence. The message updating process enables Affinity Propagation to have higher clustering quality compared with other approaches. However, its computation cost is high; it is quadratic in the number of data points. This is because it updates the messages of all data point pairs. This paper proposes an efficient algorithm that guarantees the same clustering results as the original algorithm. Our approach, F-AP, is based on two ideas: (1) it computes upper and lower estimates to limit the messages to be updated in each iteration, and (2) it dynamically detects converged messages to efficiently skip unneeded updates. Experiments show that F-AP is much faster than previous approaches with no loss in clustering performance.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133495701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big Data Analytics: Optimization and Randomization","authors":"Tianbao Yang, Qihang Lin, Rong Jin","doi":"10.1145/2783258.2789989","DOIUrl":"https://doi.org/10.1145/2783258.2789989","url":null,"abstract":"As the scale and dimensionality of data continue to grow in many applications of data analytics (e.g., bioinformatics, finance, computer vision, medical informatics), it becomes critical to develop efficient and effective algorithms to solve numerous machine learning and data mining problems. This tutorial will focus on simple yet practically effective techniques and algorithms for big data analytics. In the first part, we plan to present the state-of-the-art large-scale optimization algorithms, including various stochastic gradient descent methods, stochastic coordinate descent methods and distributed optimization algorithms, for solving various machine learning problems. In the second part, we will focus on randomized approximation algorithms for learning from large-scale data. We will discuss i) randomized algorithms for low-rank matrix approximation; ii) approximation techniques for solving kernel learning problems; iii) randomized reduction methods for addressing the high-dimensional challenge. Along with the description of algorithms, we will also present some empirical results to facilitate understanding of different algorithms and comparison between them.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"251 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132187018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voltage Correlations in Smart Meter Data","authors":"R. Mitra, Ramachandra Kota, S. Bandyopadhyay, V. Arya, B. Sullivan, Richard Mueller, H. Storey, Gerard Labut","doi":"10.1145/2783258.2788594","DOIUrl":"https://doi.org/10.1145/2783258.2788594","url":null,"abstract":"The connectivity model of a power distribution network can easily become outdated due to system changes occurring in the field. Maintaining and sustaining an accurate connectivity model is a key challenge for distribution utilities worldwide. This work shows that voltage time series measurements collected from customer smart meters exhibit correlations that are consistent with the hierarchical structure of the distribution network. These correlations may be leveraged to cluster customers based on common ancestry and help verify and correct an existing connectivity model. Additionally, customers may be clustered in combination with voltage data from circuit metering points, spatial data from the geographical information system, and any existing but partially accurate connectivity model to infer customer to transformer and phase connectivity relationships with high accuracy. We report analysis and validation results based on data collected from multiple feeders of a large electric distribution network in North America. To the best of our knowledge, this is the first large scale measurement study of customer voltage data and its use in inferring network connectivity information.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130252552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible and Robust Multi-Network Clustering","authors":"Jingchao Ni, Hanghang Tong, Wei Fan, Xiang Zhang","doi":"10.1145/2783258.2783262","DOIUrl":"https://doi.org/10.1145/2783258.2783262","url":null,"abstract":"Integrating multiple graphs (or networks) has been shown to be a promising approach to improve the graph clustering accuracy. Various multi-view and multi-domain graph clustering methods have recently been developed to integrate multiple networks. In these methods, a network is treated as a view or domain.The key assumption is that there is a common clustering structure shared across all views (domains), and different views (domains) provide compatible and complementary information on this underlying clustering structure. However, in many emerging real-life applications, different networks have different data distributions, where the assumption that all networks share a single common clustering structure does not hold. In this paper, we propose a flexible and robust framework that allows multiple underlying clustering structures across different networks. Our method models the domain similarity as a network, which can be utilized to regularize the clustering structures in different networks. We refer to such a data model as a network of networks (NoN). We develop NoNClus, a novel method based on non-negative matrix factorization (NMF), to cluster an NoN. We provide rigorous theoretical analysis of NoNClus in terms of its correctness, convergence and complexity. Extensive experimental results on synthetic and real-life datasets show the effectiveness of our method.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131850996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Artificial Intelligence and Big Data Created Rocket Fuel: A Case Study","authors":"George H. John","doi":"10.1145/2783258.2790458","DOIUrl":"https://doi.org/10.1145/2783258.2790458","url":null,"abstract":"In 2008, Rocket Fuel's founders saw a gap in the digital advertising market. None of the existing players were building autonomous systems based on big data and artificial intelligence, but instead they were offering fairly simple technology and relying on human campaign managers to drive success. Five years later in 2013, Rocket Fuel had the best technology IPO of the year on NASDAQ, reported $240 million in revenue, and was ranked by accounting firm Deloitte as the #1 fastest-growing technology company in North America. Along the way we learned that it's okay to be bold in our expectations of what is possible with fully autonomous systems, we learned that mainstream customers will buy advanced technology if it's delivered in a familiar way, and we also learned that it's incredibly difficult to debug the complex \"robot psychology\" when a number of complex autonomous systems interact. We also had excellent luck and timing: as we were building the company, real-time ad impression-level auctions with machine-to-machine buying and selling became commonplace, and marketers became increasingly focused on delivering better results for their company and delivering better personalized and relevant digital experiences for their customers. The case study presentation will present a fast-paced overview of the business and technology context for Rocket Fuel at inception and at present, key learnings and decisions, and the road ahead.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131934442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web Personalization and Recommender Systems","authors":"S. Berkovsky, J. Freyne","doi":"10.1145/2783258.2789995","DOIUrl":"https://doi.org/10.1145/2783258.2789995","url":null,"abstract":"The quantity of accessible information has been growing rapidly and far exceeded human processing capabilities. The sheer abundance of information often prevents users from discovering the desired information, or aggravates making informed and correct choices. This highlights the pressing need for intelligent personalized applications that simplify information access and discovery by taking into account users' preferences and needs. One type of personalized application that has recently become tremendously popular in research and industry is recommender systems. These provide to users personalized recommendations about information and products they may be interested to examine or purchase. Extensive research into recommender systems has yielded a variety of techniques, which have been published at a variety of conferences and adopted by numerous Web-sites. This tutorial will provide the participants with broad overview and thorough understanding of algorithms and practically deployed Web and mobile applications of personalized technologies.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"48 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132204494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Data Mining for Authenticity Assessment and Protection of High-Quality Italian Wines from Piedmont","authors":"M. Arlorio, J. Coïsson, G. Leonardi, M. Locatelli, L. Portinale","doi":"10.1145/2783258.2788596","DOIUrl":"https://doi.org/10.1145/2783258.2788596","url":null,"abstract":"This paper discusses the data mining approach followed in a project called TRAQUASwine, aimed at the definition of methods for data analytical assessment of the authenticity and protection, against fake versions, of some of the highest value Nebbiolo-based wines from Piedmont region in Italy. This is a big issue in the wine market, where commercial frauds related to such a kind of products are estimated to be worth millions of Euros. The objective is twofold: to show that the problem can be addressed without expensive and hyper-specialized wine analyses, and to demonstrate the actual usefulness of classification algorithms for data mining on the resulting chemical profiles. Following Wagstaff's proposal for practical exploitation of machine learning (and data mining) approaches, we describe how data have been collected and prepared for the production of different datasets, how suitable classification models have been identified and how the interpretation of the results suggests the emergence of an active role of classification techniques, based on standard chemical profiling, for the assesment of the authenticity of the wines target of the study.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134326692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Discovery of Evolving Truth","authors":"Yaliang Li, Qi Li, Jing Gao, Lu Su, Bo Zhao, Wei Fan, Jiawei Han","doi":"10.1145/2783258.2783277","DOIUrl":"https://doi.org/10.1145/2783258.2783277","url":null,"abstract":"In the era of big data, information regarding the same objects can be collected from increasingly more sources. Unfortunately, there usually exist conflicts among the information coming from different sources. To tackle this challenge, truth discovery, i.e., to integrate multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic. In many real world applications, however, the information may come sequentially, and as a consequence, the truth of objects as well as the reliability of sources may be dynamically evolving. Existing truth discovery methods, unfortunately, cannot handle such scenarios. To address this problem, we investigate the temporal relations among both object truths and source reliability, and propose an incremental truth discovery framework that can dynamically update object truths and source weights upon the arrival of new data. Theoretical analysis is provided to show that the proposed method is guaranteed to converge at a fast rate. The experiments on three real world applications and a set of synthetic data demonstrate the advantages of the proposed method over state-of-the-art truth discovery methods.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130579659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}