Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining最新文献_第8页

Probabilistic Graphical Models of Dyslexia 阅读障碍的概率图形模型

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2788604

Yair Lakretz, Gal Chechik, N. Friedmann, M. Rosen-Zvi

引用次数: 10

The Effectiveness of Marketing Strategies in Social Media: Evidence from Promotional Events 社会化媒体营销策略的有效性:来自促销活动的证据

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2788597

Panagiotis Adamopoulos, Vilma Todri

{"title":"The Effectiveness of Marketing Strategies in Social Media: Evidence from Promotional Events","authors":"Panagiotis Adamopoulos, Vilma Todri","doi":"10.1145/2783258.2788597","DOIUrl":"https://doi.org/10.1145/2783258.2788597","url":null,"abstract":"This paper studies a novel social media venture and seeks to understand the effectiveness of marketing strategies in social media platforms by evaluating their impact on participating brands and organizations. We use a real-world data set and employ a promising research approach combining econometric with predictive modeling techniques in a causal estimation framework that allows for more accurate counterfactuals. Based on the results of the presented analysis and focusing on the long-term business value of marketing strategies in social media, we find that promotional events leveraging implicit or explicit advocacy in social media platforms result in significant abnormal returns for the participating brand, in terms of expanding the social media fan base of the firm. The effect is also economically significant as it corresponds to an increase of several thousand additional new followers per day for an average size brand. We also precisely quantify the impact of various promotion characteristics and demonstrate what types of promotions are more effective and for which brands, while suggesting specific tactical strategies. For instance, despite the competition for consumers' attention, brands and marketers should broadcast marketing messages on social networks during the time of peak usage in order to maximize their returns. Overall, we provide actionable insights with major implications for firms and social media platforms and contribute to the related literature as we discover new rich findings enabled by the employed causal estimation framework.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"644 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123045330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 26

An Efficient Semi-Supervised Clustering Algorithm with Sequential Constraints 一种有效的序列约束半监督聚类算法

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2783389

Jinfeng Yi, Lijun Zhang, Tianbao Yang, W. Liu, Jun Wang

{"title":"An Efficient Semi-Supervised Clustering Algorithm with Sequential Constraints","authors":"Jinfeng Yi, Lijun Zhang, Tianbao Yang, W. Liu, Jun Wang","doi":"10.1145/2783258.2783389","DOIUrl":"https://doi.org/10.1145/2783258.2783389","url":null,"abstract":"Semi-supervised clustering leverages side information such as pairwise constraints to guide clustering procedures. Despite promising progress, existing semi-supervised clustering approaches overlook the condition of side information being generated sequentially, which is a natural setting arising in numerous real-world applications such as social network and e-commerce system analysis. Given emerged new constraints, classical semi-supervised clustering algorithms need to re-optimize their objectives over all data samples and constraints in availability, which prevents them from efficiently updating the obtained data partitions. To address this challenge, we propose an efficient dynamic semi-supervised clustering framework that casts the clustering problem into a search problem over a feasible convex set, i.e., a convex hull with its extreme points being an ensemble of m data partitions. According to the principle of ensemble clustering, the optimal partition lies in the convex hull, and can thus be uniquely represented by an m-dimensional probability simplex vector. As such, the dynamic semi-supervised clustering problem is simplified to the problem of updating a probability simplex vector subject to the newly received pairwise constraints. We then develop a computationally efficient updating procedure to update the probability simplex vector in O(m2) time, irrespective of the data size n. Our empirical studies on several real-world benchmark datasets show that the proposed algorithm outperforms the state-of-the-art semi-supervised clustering algorithms with visible performance gain and significantly reduced running time.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121406598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 26

Machine Learning and Causal Inference for Policy Evaluation 政策评估的机器学习和因果推理

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2785466

S. Athey

{"title":"Machine Learning and Causal Inference for Policy Evaluation","authors":"S. Athey","doi":"10.1145/2783258.2785466","DOIUrl":"https://doi.org/10.1145/2783258.2785466","url":null,"abstract":"A large literature on causal inference in statistics, econometrics, biostatistics, and epidemiology (see, e.g., Imbens and Rubin [2015] for a recent survey) has focused on methods for statistical estimation and inference in a setting where the researcher wishes to answer a question about the (counterfactual) impact of a change in a policy, or \"treatment\" in the terminology of the literature. The policy change has not necessarily been observed before, or may have been observed only for a subset of the population; examples include a change in minimum wage law or a change in a firm's price. The goal is then to estimate the impact of small set of \"treatments\" using data from randomized experiments or, more commonly, \"observational\" studies (that is, non-experimental data). The literature identifies a variety of assumptions that, when satisfied, allow the researcher to draw the same types of conclusions that would be available from a randomized experiment. To estimate causal effects given non-random assignment of individuals to alternative policies in observational studies, popular techniques include propensity score weighting, matching, and regression analysis; all of these methods adjust for differences in observed attributes of individuals. Another strand of literature in econometrics, referred to as \"structural modeling,\" fully specifies the preferences of actors as well as a behavioral model, and estimates those parameters from data (for applications to auction-based electronic commerce, see Athey and Haile [2007] and Athey and Nekipelov [2012]). In both cases, parameter estimates are interpreted as \"causal,\" and they are used to make predictions about the effect of policy changes. In contrast, the supervised machine learning literature has traditionally focused on prediction, providing data-driven approaches to building rich models and relying on cross-validation as a powerful tool for model selection. These methods have been highly successful in practice. This talk will review several recent papers that attempt to bring the tools of supervised machine learning to bear on the problem of policy evaluation, where the papers are connected by three themes. The first theme is that it important for both estimation and inference to distinguish between parts of the model that relate to the causal question of interest, and \"attributes,\" that is, features or variables that describe attributes of individual units that are held fixed when policies change. Specifically, we propose to divide the features of a model into causal features, whose values may be manipulated in a counterfactual policy environment, and attributes. A second theme is that relative to conventional tools from the policy evaluation literature, tools from supervised machine learning can be particularly effective at modeling the association of outcomes with attributes, as well as in modeling how causal effects vary with attributes. A final theme is that modifications of existing methods may","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115254149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 97

Discovery of Meaningful Rules in Time Series 发现时间序列中的有意义规则

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2783306

Mohammad Shokoohi-Yekta, Yanping Chen, Bilson J. L. Campana, Bing Hu, J. Zakaria, Eamonn J. Keogh

{"title":"Discovery of Meaningful Rules in Time Series","authors":"Mohammad Shokoohi-Yekta, Yanping Chen, Bilson J. L. Campana, Bing Hu, J. Zakaria, Eamonn J. Keogh","doi":"10.1145/2783258.2783306","DOIUrl":"https://doi.org/10.1145/2783258.2783306","url":null,"abstract":"The ability to make predictions about future events is at the heart of much of science; so, it is not surprising that prediction has been a topic of great interest in the data mining community for the last decade. Most of the previous work has attempted to predict the future based on the current value of a stream. However, for many problems the actual values are irrelevant, whereas the shape of the current time series pattern may foretell the future. The handful of research efforts that consider this variant of the problem have met with limited success. In particular, it is now understood that most of these efforts allow the discovery of spurious rules. We believe the reason why rule discovery in real-valued time series has failed thus far is because most efforts have more or less indiscriminately applied the ideas of symbolic stream rule discovery to real-valued rule discovery. In this work, we show why these ideas are not directly suitable for rule discovery in time series. Beyond our novel definitions/representations, which allow for meaningful and extendable specifications of rules, we further show novel algorithms that allow us to quickly discover high quality rules in very large datasets that accurately predict the occurrence of future events.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"232 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116229914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 85

Data Driven Science: SIGKDD Panel 数据驱动科学:SIGKDD小组

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2788703

K. Morik, H. Durrant-Whyte, Gary Hill, Dietmar Müller, T. Berger-Wolf

引用次数: 2

Predicting Voice Elicited Emotions 预测声音引发的情绪

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2788619

Y. Li, Jose D. Contreras, Luis J. Salazar

{"title":"Predicting Voice Elicited Emotions","authors":"Y. Li, Jose D. Contreras, Luis J. Salazar","doi":"10.1145/2783258.2788619","DOIUrl":"https://doi.org/10.1145/2783258.2788619","url":null,"abstract":"We present the research, and product development and deployment, of Voice Analyzer' by Jobaline Inc. This is a patent pending technology that analyzes voice data and predicts human emotions elicited by the paralinguistic elements of a voice. Human voice characteristics, such as tone, complement the verbal communication. In several contexts of communication, \"how\" things are said is just as important as \"what\" is being said. This paper provides an overview of our deployed system, the raw data, the data processing steps, and the prediction algorithms we experimented with. A case study is included where, given a voice clip, our model predicts the degree in which a listener will find the voice \"engaging\". Our prediction results were verified through independent market research with 75% in agreement on how an average listener would feel. One application of Jobaline Voice Analyzer technology is for assisting companies to hire workers in the service industry where customers' emotional response to workers' voice may affect the service outcome. Jobaline Voice Analyzer is deployed in production as a product offer to our clients to help them identify workers who will better engage with their customers. We will also share some discoveries and lessons learned.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"175 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124311739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

VC-Dimension and Rademacher Averages: From Statistical Learning Theory to Sampling Algorithms vc维和Rademacher平均:从统计学习理论到抽样算法

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2789984

Matteo Riondato, E. Upfal

引用次数: 4

Structured Hedging for Resource Allocations with Leverage 杠杆资源配置的结构性对冲

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2783378

Nicholas Johnson, A. Banerjee

{"title":"Structured Hedging for Resource Allocations with Leverage","authors":"Nicholas Johnson, A. Banerjee","doi":"10.1145/2783258.2783378","DOIUrl":"https://doi.org/10.1145/2783258.2783378","url":null,"abstract":"Data mining algorithms for computing solutions to online resource allocation (ORA) problems have focused on budgeting resources currently in possession, e.g., investing in the stock market with cash on hand or assigning current employees to projects. In several settings, one can leverage borrowed resources with which tasks can be accomplished more efficiently and cheaply. Additionally, a variety of opposing allocation types or positions may be available with which one can hedge the allocation to alleviate risk from external changes. In this paper, we present a formulation for hedging online resource allocations with leverage and propose an efficient data mining algorithm (SHERAL). We pose the problem as a constrained online convex optimization problem. The key novel components of our formulation are (1) a loss function for general leveraging and opposing allocation positions and (2) a penalty function which hedges between structurally dependent allocation positions to control risk. We instantiate the problem in the context of portfolio selection and evaluate the effectiveness of the formulation through extensive experiments on five datasets in comparison with existing algorithms and several variants.","PeriodicalId":243428,"journal":{"name":"Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","volume":"301 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131869422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Probabilistic Community and Role Model for Social Networks 社会网络的概率社区和角色模型

Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Pub Date : 2015-08-10 DOI: 10.1145/2783258.2783274

Yu Han, Jie Tang

引用次数: 38