Proceedings of the Ninth ACM International Conference on Web Search and Data Mining: Latest Papers (Page 4)

Improving IP Geolocation using Query Logs

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835820

Ovidiu Dan, Vaibhav Parikh, Brian D. Davison

Citations: 24

Multi-Score Position Auctions

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835822

D. Charles, Nikhil R. Devanur, Balasubramanian Sivan

Abstract: In this paper we propose a general family of position auctions used in paid search, which we call multi-score position auctions. These auctions contain the GSP auction and the GSP auction with squashing as special cases. We show experimentally that these auctions contain special cases that perform better than the GSP auction with squashing, in terms of both revenue and the number of clicks on ads. In particular, we study in detail the special case that squashes the first slot alone and show that this beats pure squashing (which squashes all slots uniformly). We study the equilibria that arise in this special case to examine both the first-order and the second-order effects of moving from the squashing-all-slots auction to the squash-only-the-top-slot auction. To study the second-order effect, we simulate auctions using the value-relevance correlated distribution suggested in Lahaie and Pennock [2007]. Since this distribution is derived from a study of value and relevance distributions at Yahoo!, we believe the insights derived from this simulation to be valuable.
To measure the first-order effect, in addition to this simulation, we also conduct experiments using auction data from Bing over several weeks that includes a random sample of all auctions.
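For readers unfamiliar with the baseline, the GSP auction with squashing that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard squashing rule (rank ads by bid times relevance raised to an exponent alpha, then charge each winner the smallest bid that would keep its slot). It is not the multi-score mechanism proposed in the paper, and the function name, parameters, and example numbers are assumptions made for illustration.

```python
from typing import List, Tuple


def gsp_with_squashing(
    bids: List[float],
    relevances: List[float],
    num_slots: int,
    alpha: float = 1.0,
) -> List[Tuple[int, float]]:
    """Return (advertiser index, price per click) for each filled slot.

    Assumes every relevance is strictly positive. alpha = 1 ranks by
    bid * relevance; alpha = 0 ignores relevance and ranks by bid alone.
    """
    # Squashed ranking score for every advertiser.
    scores = [b * (q ** alpha) for b, q in zip(bids, relevances)]
    order = sorted(range(len(bids)), key=lambda i: scores[i], reverse=True)

    results = []
    for pos in range(min(num_slots, len(order))):
        winner = order[pos]
        if pos + 1 < len(order):
            # Smallest bid that still beats the next-ranked advertiser's score.
            runner_up = order[pos + 1]
            price = scores[runner_up] / (relevances[winner] ** alpha)
        else:
            # No competitor below: this sketch charges nothing (no reserve price).
            price = 0.0
        results.append((winner, price))
    return results


if __name__ == "__main__":
    bids = [2.0, 3.0, 1.0]          # hypothetical per-click bids
    relevances = [0.5, 0.2, 0.1]    # hypothetical click probabilities
    for alpha in (1.0, 0.0):
        print(alpha, gsp_with_squashing(bids, relevances, num_slots=2, alpha=alpha))
```

In this toy instance, advertiser 0 takes the top slot when alpha is 1, while advertiser 1 (the highest bidder) takes it when alpha is 0, which shows how the squashing exponent trades relevance against raw bids.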

Citations: 7

Session details: Observing Users

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/3253875

M. Lalmas

Citations: 0

Understanding Diffusion Processes: Inference and Theory

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2855084

Xinran He

Abstract: With the increasing popularity of social media and social network sites, analyzing social networks offers great potential to shed light on human social structure and provides great marketing opportunities. Usually, social network analysis starts with extracting or learning the social network and the associated parameters. Contrary to other analytical tasks, this step is highly non-trivial due to the amorphous nature of social ties and the challenges of noisy and incomplete observations. My research focuses on improving accuracy in inferring the network as well as analyzing the consequences when the extracted network is noisy or erroneous. To be more precise, I propose to study the following two questions with a special focus on analyzing diffusion behaviors: (1) How to utilize special properties of social networks to improve the accuracy of the extracted network under noisy and missing data; (2) How to characterize the impact of noise in the inferred network and carry out robust analysis and optimization. Usually the first step towards social influence analysis is to infer the diffusion network. Assuming a probabilistic model of influence and a model of how the timing of individuals' adoption decisions correlates, one can use these data to estimate the strengths of influence between pairs of individuals. However, existing approaches to Network Inference rely on the common assumption that the observations used to train the models are complete, while missing observations are commonplace in practice due to time or technical limitations in data collection. Therefore, I propose to study the impact of incomplete observations and design efficient methods to compensate for noise or incompleteness in observed data.
I propose to exploit the fact that social networks have more specific structure than arbitrary graphs. A joint estimation of the graph generation model and the actual network structure is likely to significantly improve the estimation accuracy. Moreover, incorporating the content information of the cascade also has the potential to improve the inference accuracy. Therefore, I propose to combine the Correlated Topic Model [1] and Hawkes Process [5, 4, 6] into a unified model to utilize content information [2]. Due to noise or missing data in the observations, even in the best case, one would expect that the inferred network structure and link strengths will only be an approximation to the truth; in other words, noise in the data will be pervasive for inferred social networks. I propose to focus on the algorithmic question of Influence Maximization [3] in the context of noisy social network data. More specifically, I propose to consider the following questions: Given an instance of an Influence Model, with level of mis-estimation: (1) Decide whether the objective function on this instance varies smoothly with perturbations to the parameters. (2) If the dependence is smooth, how to find a robustly near-optimal solution.
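As background for the Hawkes Process mentioned in the abstract, the following minimal sketch evaluates the conditional intensity of a univariate Hawkes process with an exponential kernel, one common textbook form. The abstract does not state which kernel or parameterization the author uses, so the kernel choice and the names `mu`, `alpha`, and `beta` are assumptions made for illustration only.

```python
import math
from typing import Sequence


def hawkes_intensity(
    t: float,
    event_times: Sequence[float],
    mu: float,
    alpha: float,
    beta: float,
) -> float:
    """Intensity lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - s)).

    mu is the base rate, alpha the jump added by each past event, and
    beta the rate at which that excitation decays.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation


if __name__ == "__main__":
    # Hypothetical adoption times in one cascade.
    events = [1.0, 1.5, 4.0]
    for t in (0.5, 2.0, 5.0):
        print(t, hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.0))
```

Each past event temporarily raises the rate of further events, which is why this family of processes is a natural fit for modeling cascades of adoptions.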

Citations: 0

DiFacto: Distributed Factorization Machines

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835781

Mu Li, Ziqi Liu, Alex Smola, Yu-Xiang Wang

Citations: 55

Affective Computing of Image Emotion Perceptions

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2855082

Sicheng Zhao

Abstract: Images can convey rich semantics and evoke strong emotions in viewers. The research of my PhD thesis focuses on image emotion computing (IEC), which aims to predict the emotion perceptions of given images. The development of IEC is greatly constrained by two main challenges: the affective gap and subjective evaluation [5]. Previous works mainly focused on finding features that can express emotions better to bridge the affective gap, such as elements-of-art based features [2] and shape features [1]. Based on the emotion representation models, including categorical emotion states (CES) and dimensional emotion space (DES) [5], three different tasks are traditionally performed in IEC: affective image classification, regression, and retrieval. The state-of-the-art methods for these three tasks are image-centric, focusing on the dominant emotions for the majority of viewers. For my PhD thesis, I plan to answer the following questions: 1. Compared to the low-level elements-of-art based features, can we find some higher-level features that are more interpretable and have a stronger link to emotions? 2. Are the emotions that are evoked in viewers by an image subjective and different? If they are, how can we tackle user-centric emotion prediction? 3. For image-centric emotion computing, can we predict the emotion distribution instead of the dominant emotion category? 1. The artistic elements must be carefully arranged and orchestrated into meaningful regions and images to describe specific semantics and emotions. The rules, tools, or guidelines for arranging and orchestrating the elements-of-art in an artwork are known as the principles-of-art, which consider various artistic aspects, including balance, emphasis, harmony, variety, gradation, movement, rhythm, and proportion [5].
We systematically study and formulate the first six artistic principles, explaining the concepts and translating these concepts into mathematical formulae. 2. The images in the Abstract dataset [2] were labeled by 14 people on average. 81% of the images are assigned 5 to 8 emotions, so the perceived emotions of different viewers may vary. To further demonstrate this observation, we set up a large-scale dataset, named the Image-Emotion-Social-Net dataset, with over 1 million images downloaded from Flickr. To get the personalized emotion labels, firstly we use traditional lexicon-based methods as in [4] to obtain the text segmentation results of the title, tags and descrip-

Citations: 1

Barbara Made the News: Mining the Behavior of Crowds for Time-Aware Learning to Rank

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835825

Flávio Martins, João Magalhães, Jamie Callan

Abstract: In Twitter, and other microblogging services, the generation of new content by the crowd is often biased towards immediacy: what is happening now. Prompted by the propagation of commentary and information through multiple media, users on the Web interact with and produce new posts about newsworthy topics and give rise to trending topics. This paper proposes to leverage the behavioral dynamics of users to estimate the most relevant time periods for a topic. Our hypothesis stems from the fact that when a real-world event occurs it usually has peak times on the Web: a higher volume of tweets, new visits and edits to related Wikipedia articles, and news published about the event. In this paper, we propose a novel time-aware ranking model that leverages multiple sources of crowd signals. Our approach builds on two major novelties. First, a unifying approach that, given a query q, mines and represents temporal evidence from multiple sources of crowd signals. This allows us to predict the temporal relevance of documents for query q. Second, a principled retrieval model that integrates temporal signals in a learning-to-rank framework, to rank results according to the predicted temporal relevance.
Evaluation on the TREC 2013 and 2014 Microblog track datasets demonstrates that the proposed model achieves a relative improvement of 13.2% over lexical retrieval models and 6.2% over a learning-to-rank baseline.

Citations: 13

Enforcing k-anonymity in Web Mail Auditing

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835803

Dotan Di Castro, L. Lewin-Eytan, Y. Maarek, R. Wolff, Eyal Zohar

Abstract: We study the problem of k-anonymization of mail messages in the realistic scenario of auditing mail traffic in a major commercial Web mail service. Mail auditing is necessary in various Web mail debugging and quality assurance activities, such as anti-spam or the qualitative evaluation of novel mail features. It is conducted by trained professionals, often referred to as "auditors", who are shown messages that could expose personally identifiable information. We address here the challenge of k-anonymizing such messages, focusing on machine-generated mail messages, which represent more than 90% of today's mail traffic. We introduce a novel message signature, Mail-Hash, specifically tailored to identifying structurally similar messages, which allows us to put such messages in the same equivalence class. We then define a process that generates, for each class, masked mail samples that can be shown to auditors, while guaranteeing the k-anonymity of users. The productivity of auditors is measured by the amount of non-hidden mail content they can see every day, while considering normal working conditions, which set a limit on the number of mail samples they can review. In addition, we consider k-anonymity over time since, by definition of k-anonymity, every new release places additional constraints on the assignment of samples. We describe in detail the results we obtained over actual Yahoo mail traffic, and thus demonstrate that our methods are feasible at Web mail scale.
Given the constantly growing concern of users over their email being scanned by others, we argue that it is critical to devise algorithms that guarantee k-anonymity, and to implement associated processes in order to restore the trust of mail users.
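The equivalence-class idea in the abstract can be illustrated with a minimal sketch: group messages by a structural signature and treat a class as releasable only if it covers at least k distinct users. The signature strings below are a hypothetical stand-in, not the paper's Mail-Hash, and the sketch deliberately omits the masking step and the over-time release constraints that the abstract describes; the function name and sample data are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def releasable_classes(
    messages: Iterable[Tuple[str, str]], k: int
) -> Dict[str, Set[str]]:
    """Map each signature to its users, keeping classes with >= k distinct users.

    `messages` yields (user_id, signature) pairs, where the signature is
    any value shared by structurally similar messages.
    """
    users_by_signature: Dict[str, Set[str]] = defaultdict(set)
    for user_id, signature in messages:
        # A set ensures a user with many similar messages is counted once.
        users_by_signature[signature].add(user_id)
    return {
        sig: users
        for sig, users in users_by_signature.items()
        if len(users) >= k
    }


if __name__ == "__main__":
    # Hypothetical (user, signature) pairs.
    sample = [
        ("u1", "receipt-template"),
        ("u2", "receipt-template"),
        ("u3", "receipt-template"),
        ("u1", "receipt-template"),
        ("u3", "newsletter-template"),
    ]
    print(releasable_classes(sample, k=3))
```

Counting distinct users rather than messages matters here: a class made up of many messages from a single user would still identify that user.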

Citations: 26

Extracting Search Query Patterns via the Pairwise Coupled Topic Model

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835794

Takuya Konishi, Takuya Ohwa, Sumio Fujita, K. Ikeda, K. Hayashi

Citations: 7

Understanding and Identifying Advocates for Political Campaigns on Social Media

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2016-02-08 DOI: 10.1145/2835776.2835807

Suhas Ranganath, Xia Hu, Jiliang Tang, Huan Liu

Abstract: Social media is increasingly being used to access and disseminate information on sociopolitical issues like gun rights and general elections. The popularity and openness of social media make it conducive for some individuals, known as advocates, to push their agendas on these issues strategically. Identifying these advocates can alert social media users before they read the advocates' content and also enables campaign managers to identify advocates for their digital political campaigns. A significant challenge in identifying advocates is that they employ nuanced strategies to shape user opinion and increase the spread of their messages, making it difficult to distinguish them from random users posting on the campaign. In this paper, we draw from social movement theories and design a quantitative framework to study the nuanced message strategies, propagation strategies, and community structure adopted by advocates for political campaigns in social media. Based on observations of the social media activities that result from these strategies, we investigate how to model the strategies in order to identify advocates.
We evaluate the framework using two datasets from Twitter, and our experiments demonstrate its effectiveness in identifying advocates for political campaigns. The ramifications of this work are directed towards assisting users as they navigate through social media spaces.

Citations: 18