{"title":"Session details: Keynote Address","authors":"Xueqi Cheng","doi":"10.1145/3251090","DOIUrl":"https://doi.org/10.1145/3251090","url":null,"abstract":"","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115169726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging In-Batch Annotation Bias for Crowdsourced Active Learning","authors":"Honglei Zhuang, Joel Young","doi":"10.1145/2684822.2685301","DOIUrl":"https://doi.org/10.1145/2684822.2685301","url":null,"abstract":"Data annotation bias is found in many situations. Often it can be ignored as just another component of the noise floor. However, it is especially prevalent in crowdsourcing tasks and must be actively managed. Annotation bias on single data items has been studied with regard to data difficulty, annotator bias, etc., while annotation bias on batches of multiple data items simultaneously presented to annotators has not been studied. In this paper, we verify the existence of \"in-batch annotation bias\" between data items in the same batch. We propose a factor graph based batch annotation model to quantitatively capture the in-batch annotation bias, and measure the bias during a crowdsourcing annotation process of inappropriate comments in LinkedIn. We discover that annotators tend to make polarized annotations for the entire batch of data items in our task. We further leverage the batch annotation model to propose a novel batch active learning algorithm. We test the algorithm on a real crowdsourcing platform and find that it outperforms in-batch bias naïve algorithms.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"312 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123118666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maria Daltayanni, L. D. Alfaro, Panagiotis Papadimitriou
{"title":"WorkerRank: Using Employer Implicit Judgements to Infer Worker Reputation","authors":"Maria Daltayanni, L. D. Alfaro, Panagiotis Papadimitriou","doi":"10.1145/2684822.2685286","DOIUrl":"https://doi.org/10.1145/2684822.2685286","url":null,"abstract":"In online labor marketplaces two parties are involved; employers and workers. An employer posts a job in the marketplace to receive applications from interested workers. After evaluating the match to the job, the employer hires one (or more workers) to accomplish the job via an online contract. At the end of the contract, the employer can provide his worker with some rating that becomes visible in the worker online profile. This form of explicit feedback guides future hiring decisions, since it is indicative of worker true ability. In this paper, first we discuss some of the shortcomings of the existing reputation systems that are based on the end-of-contract ratings. Then we propose a new reputation mechanism that uses Bayesian updates to combine employer implicit feedback signals in a link-analysis approach. The new system addresses the shortcomings of existing approaches, while yielding better signal for the worker quality towards hiring decision.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125567840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and Predicting Retweeting Dynamics on Microblogging Platforms","authors":"Shuai Gao, Jun Ma, Zhumin Chen","doi":"10.1145/2684822.2685303","DOIUrl":"https://doi.org/10.1145/2684822.2685303","url":null,"abstract":"Popularity prediction on microblogging platforms aims to predict the future popularity of a message based on its retweeting dynamics in the early stages. Existing works mainly focus on exploring effective features for prediction, while ignoring the underlying arrival process of retweets. Also, the effect of user activity variation on the retweeting dynamics in the early stages has been neglected. In this paper, we propose an extended reinforced Poisson process model with time mapping process to model the retweeting dynamics and predict the future popularity. The proposed model explicitly characterizes the process through which a message gain its retweets, by capturing a power-law temporal relaxation function corresponding to the aging in the ability of the message to attract new retweets and an exponential reinforcement mechanism characterizing the \"richer-get-richer\" phenomenon. Further, we introduce the notation of weibo time and integrate a time mapping process into the proposed model to eliminate the effect of user activity variation. Extensive experiments on two Weibo datasets, with 10K and 18K messages respectively, well demonstrate the effectiveness of our proposed model in popularity prediction.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126834836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Engagement Periodicity in Search Engine Usage: Analysis and its Application to Search Quality Evaluation","authors":"Alexey Drutsa, Gleb Gusev, P. Serdyukov","doi":"10.1145/2684822.2685318","DOIUrl":"https://doi.org/10.1145/2684822.2685318","url":null,"abstract":"Nowadays, billions of people use the Web in connection with their daily needs. A significant part of the needs are constituted by search tasks that are usually addressed by search engines. Thus, daily search needs result in regular user engagement with a search engine. User engagement with web sites and services was studied in various aspects, but there appear to be no studies of its regularity and periodicity. In this paper, we studied periodicity of the user engagement with a popular search engine through applying spectrum analysis to temporal sequences of different engagement metrics. We found periodicity patterns of user engagement and revealed classes of users whose periodicity patterns do not change over a long period of time. In addition, we used the spectrum series as metrics to evaluate search quality.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115053312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiepu Jiang, Ahmed Hassan Awadallah, Xiaolin Shi, Ryen W. White
{"title":"Understanding and Predicting Graded Search Satisfaction","authors":"Jiepu Jiang, Ahmed Hassan Awadallah, Xiaolin Shi, Ryen W. White","doi":"10.1145/2684822.2685319","DOIUrl":"https://doi.org/10.1145/2684822.2685319","url":null,"abstract":"Understanding and estimating satisfaction with search engines is an important aspect of evaluating retrieval performance. Research to date has modeled and predicted search satisfaction on a binary scale, i.e., the searchers are either satisfied or dissatisfied with their search outcome. However, users' search experience is a complex construct and there are different degrees of satisfaction. As such, binary classification of satisfaction may be limiting. To the best of our knowledge, we are the first to study the problem of understanding and predicting graded (multi-level) search satisfaction. We ex-amine sessions mined from search engine logs, where searcher satisfaction was also assessed on multi-point scale by human annotators. Leveraging these search log data, we observe rich and non-monotonous changes in search behavior in sessions with different degrees of satisfaction. The findings suggest that we should predict finer-grained satisfaction levels. To address this issue, we model search satisfaction using features indicating search outcome, search effort, and changes in both outcome and effort during a session. We show that our approach can predict subtle changes in search satisfaction more accurately than state-of-the-art methods, affording greater insight into search satisfaction. The strong performance of our models has implications for search providers seeking to accu-rately measure satisfaction with their services.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"208 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116451885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning About Health and Medicine from Internet Data","authors":"E. Yom-Tov, I. Cox, Vasileios Lampos","doi":"10.1145/2684822.2697042","DOIUrl":"https://doi.org/10.1145/2684822.2697042","url":null,"abstract":"Surveys show that around 70% of US Internet users consult the Internet when they require medical information. People seek this information using both traditional search engines and via social media. The information created using the search process offers an unprecedented opportunity for applications to monitor and improve the quality of life of people with a variety of medical conditions. In recent years, research in this area has addressed public-health questions such as the effect of media on development of anorexia, developed tools for measuring influenza rates and assessing drug safety, and examined the effects of health information on individual wellbeing. This tutorial will show how Internet data can facilitate medical research, providing an overview of the state-of-the-art in this area. During the tutorial we will discuss the information which can be gleaned from a variety of Internet data sources, including social media, search engines, and specialized medical websites. We will provide an overview of analysis methods used in recent literature, and show how results can be evaluated using publicly-available health information and online experimentation. Finally, we will discuss ethical and privacy issues and possible technological solutions. This tutorial is intended for researchers of user generated content who are interested in applying their knowledge to improve health and medicine.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132390676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 5: Practice & Experience Talks","authors":"P. Bennett","doi":"10.1145/3251096","DOIUrl":"https://doi.org/10.1145/3251096","url":null,"abstract":"","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126320707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sarcasm Detection on Twitter: A Behavioral Modeling Approach","authors":"Ashwin Rajadesingan, R. Zafarani, Huan Liu","doi":"10.1145/2684822.2685316","DOIUrl":"https://doi.org/10.1145/2684822.2685316","url":null,"abstract":"Sarcasm is a nuanced form of language in which individuals state the opposite of what is implied. With this intentional ambiguity, sarcasm detection has always been a challenging task, even for humans. Current approaches to automatic sarcasm detection rely primarily on lexical and linguistic cues. This paper aims to address the difficult task of sarcasm detection on Twitter by leveraging behavioral traits intrinsic to users expressing sarcasm. We identify such traits using the user's past tweets. We employ theories from behavioral and psychological studies to construct a behavioral modeling framework tuned for detecting sarcasm. We evaluate our framework and demonstrate its efficiency in identifying sarcastic tweets.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122816615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting Search with Deep Understanding of Contents and Users","authors":"Kaihua Zhu","doi":"10.1145/2684822.2697047","DOIUrl":"https://doi.org/10.1145/2684822.2697047","url":null,"abstract":"Recent years have witnessed dramatic changes in how people interact with search engines. Search engines are expected to be more intelligent in understanding users' intention and fulfilling users' needs with direct answers rather than raw information. Furthermore, search engines are expected to be equipped with recommendation and dialogue capabilities, making the interaction with users more natural and smoother. In this talk, I will introduce Baidu's work on how to make some of them come true through the deep understanding of users, queries and web pages, and discuss challenges behind these technologies.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115004184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}