{"title":"You've got Mail, and Here is What you Could do With It!: Analyzing and Predicting Actions on Email Messages","authors":"Dotan Di Castro, Zohar S. Karnin, L. Lewin-Eytan, Y. Maarek","doi":"10.1145/2835776.2835811","DOIUrl":"https://doi.org/10.1145/2835776.2835811","url":null,"abstract":"With email traffic increasing, leading Web mail services have started to offer features that assist users in reading and processing their inboxes. One approach is to identify \"important\" messages, while a complementary one is to bundle messages, especially machine-generated ones, in pre-defined categories. We rather propose here to go back to the task at hand and consider what actions the users might conduct on received messages. We thoroughly studied, in a privacy-preserving manner, the actions of a large number of users in Yahoo mail, and found out that the most frequent actions are typically read, reply, delete and a sub-type of delete, delete-without-read. We devised a learning framework for predicting these four actions, for users with various levels of activity per action. Our framework leverages both vertical learning for personalization and horizontal learning for regularization purposes. In order to verify the quality of our predictions, we conducted a large-scale experiment involving users who had previously agreed to participate in such research studies. Our results show that, for recall values of 90%, we can predict important actions such as read or reply at precision levels up to 40% for active users, which we consider pretty encouraging for an assistance task. For less active users, we show that our regularization achieves an increase in AUC of close to 50%. 
To the best of our knowledge, our work is the first to provide a unified framework of this scale for predicting multiple actions on Web email, which hopefully provides a new ground for inventing new user experiences to help users process their inboxes.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74347878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relational Learning with Social Status Analysis","authors":"Liang Wu, Xia Hu, Huan Liu","doi":"10.1145/2835776.2835782","DOIUrl":"https://doi.org/10.1145/2835776.2835782","url":null,"abstract":"Relational learning has been proposed to cope with the interdependency among linked instances in social network analysis, which often adopts network connectivity and social media content for prediction. A common assumption in existing relational learning methods is that data instances are equally important. The algorithms developed based on the assumption may be significantly affected by outlier data and thus less robust. In the meantime, it has been well established in social sciences that actors are naturally of different social status in a social network. Motivated by findings from social sciences, in this paper, we investigate whether social status analysis could facilitate relational learning. Particularly, we propose a novel framework RESA to model social status using the network structure. It extracts robust and intrinsic latent social dimensions for social actors, which are further exploited as features for supervised learning models. The proposed method is applicable for real-world relational learning problems where noise exists. Extensive experiments are conducted on datasets obtained from real-world social media platforms. 
Empirical results demonstrate the effectiveness of RESA, and further experiments are conducted to help understand the effects of parameter settings on the proposed model and how local social status works.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77975780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Communities and Social Interaction","authors":"B. Aditya Prakash","doi":"10.1145/3253872","DOIUrl":"https://doi.org/10.1145/3253872","url":null,"abstract":"","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"1390 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86493251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mobile App Tagging","authors":"Ning Chen, S. Hoi, Shaohua Li, Xiaokui Xiao","doi":"10.1145/2835776.2835812","DOIUrl":"https://doi.org/10.1145/2835776.2835812","url":null,"abstract":"Mobile app tagging aims to assign a list of keywords indicating core functionalities, main contents, key features or concepts of a mobile app. Mobile app tags can be potentially useful for app ecosystem stakeholders or other parties to improve app search, browsing, categorization, and advertising, etc. However, most mainstream app markets, e.g., Google Play, Apple App Store, etc., currently do not explicitly support such tags for apps. To address this problem, we propose a novel auto mobile app tagging framework for annotating a given mobile app automatically, which is based on a search-based annotation paradigm powered by machine learning techniques. Specifically, given a novel query app without tags, our proposed framework (i) first explores online kernel learning techniques to retrieve a set of top-N similar apps that are semantically most similar to the query app from a large app repository; and (ii) then mines the text data of both the query app and the top-N similar apps to discover the most relevant tags for annotating the query app. To evaluate the efficacy of our proposed framework, we conduct an extensive set of experiments on a large real-world dataset crawled from Google Play. 
The encouraging results demonstrate that our technique is effective and promising.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89533682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Modelling Language Innovation Acceptance in Online Social Networks","authors":"Daniel James Kershaw, Matthew Rowe, Patrick Stacey","doi":"10.1145/2835776.2835784","DOIUrl":"https://doi.org/10.1145/2835776.2835784","url":null,"abstract":"Language change and innovation are constant in online and offline communication, and have led to new words entering people's lexicon and even entering modern-day dictionaries, with recent additions of 'e-cig' and 'vape'. However, the manual work required to identify these 'innovations' is both time-consuming and subjective. In this work we demonstrate how such innovations in language can be identified across two different OSNs (Online Social Networks) through the operationalisation of known language acceptance models that incorporate relatively simple statistical tests. By grounding our work in language theory, we identified three statistical tests that can be applied: variation in frequency, form and meaning. Each shows different success rates across the two networks (a geo-bound Twitter sample and a sample of Reddit). 
These tests were also applied to different community levels within the two networks allowing for different innovations to be identified across different community structures over the two networks, for instance: identifying regional variation across Twitter, and variation across groupings of Subreddits, where identified example innovations included 'casualidad' and 'cym'.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84440479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Second Workshop on Search and Exploration of X-Rated Information (SEXI'16): WSDM Workshop Summary","authors":"Vanessa Murdock, C. Clarke, J. Kamps, Jussi Karlgren","doi":"10.1145/2835776.2855118","DOIUrl":"https://doi.org/10.1145/2835776.2855118","url":null,"abstract":"Adult content is pervasive on the web, has been a driving factor in the adoption of the Internet medium, and is responsible for a significant fraction of traffic and revenues, yet rarely attracts attention in research. The research questions surrounding adult content access behaviors are unique, and interesting and valuable research in this area can be done ethically. WSDM 2016 features a half day workshop on Search and Exploration of X-Rated Information (SEXI) for information access tasks related to adult content. While the scope of the workshop remains broad, special attention is devoted to the privacy and security issues surrounding adult content by inviting keynote speakers with extensive experience on these topics. The recent release of the personal data belonging to customers of the adult dating site Ashley Madison provides a timely context for the focus on privacy and security.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75266513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CCCF: Improving Collaborative Filtering via Scalable User-Item Co-Clustering","authors":"Yao Wu, Xudong Liu, Min Xie, M. Ester, Q. Yang","doi":"10.1145/2835776.2835836","DOIUrl":"https://doi.org/10.1145/2835776.2835836","url":null,"abstract":"Collaborative Filtering (CF) is the most popular method for recommender systems. The principal idea of CF is that users might be interested in items that are favorited by similar users, and most of the existing CF methods measure users' preferences by their behaviours over all the items. However, users might have different interests over different topics, thus might share similar preferences with different groups of users over different sets of items. In this paper, we propose a novel and scalable method CCCF which improves the performance of CF methods via user-item co-clustering. CCCF first clusters users and items into several subgroups, where each subgroup includes a set of like-minded users and a set of items in which these users share their interests. Then, traditional CF methods can be easily applied to each subgroup, and the recommendation results from all the subgroups can be easily aggregated. Compared with previous works, CCCF has several advantages including scalability, flexibility, interpretability and extensibility. 
Experimental results on four real world data sets demonstrate that the proposed method significantly improves the performance of several state-of-the-art recommendation algorithms.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"88 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75888701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and Predicting Learning Behavior in MOOCs","authors":"J. Qiu, Jie Tang, T. Liu, Jie Gong, Chenhui Zhang, Qian Zhang, Yufei Xue","doi":"10.1145/2835776.2835842","DOIUrl":"https://doi.org/10.1145/2835776.2835842","url":null,"abstract":"Massive Open Online Courses (MOOCs), which collect complete records of all student interactions in an online learning environment, offer us an unprecedented opportunity to analyze students' learning behavior at a finer granularity than ever before. Using a dataset from xuetangX, one of the largest MOOCs from China, we analyze key factors that influence students' engagement in MOOCs and study to what extent we could infer a student's learning effectiveness. We observe significant behavioral heterogeneity in students' course selection as well as their learning patterns. For example, students who exert higher effort and ask more questions are not necessarily more likely to get certificates. Additionally, the probability that a student obtains the course certificate increases dramatically (3 x higher) when she has one or more \"certificate friends\". Moreover, we develop a unified model to predict students' learning effectiveness, by incorporating user demographics, forum activities, and learning behavior. We demonstrate that the proposed model significantly outperforms (+2.03-9.03% by F1-score) several alternative methods in predicting students' performance on assignments and course certificates. The model is flexible and can be applied to various settings. 
For example, we are deploying a new feature into xuetangX to help teachers dynamically optimize the teaching process.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"104 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78532966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inferring Latent Triggers of Purchases with Consideration of Social Effects and Media Advertisements","authors":"Yusuke Tanaka, Takeshi Kurashima, Y. Fujiwara, Tomoharu Iwata, H. Sawada","doi":"10.1145/2835776.2835789","DOIUrl":"https://doi.org/10.1145/2835776.2835789","url":null,"abstract":"This paper proposes a method for inferring from single-source data the factors that trigger purchases. Here, single-source data are the histories of item purchases and media advertisement views for each individual. We assume a sequence of purchase events to be a stochastic process incorporating the following three factors: (a) user preference, (b) social effects received from other users, and (c) media advertising effects. As our user-purchase model incorporates the latent relationships between users and advertisers, it can infer the latent triggers of purchases. Experiments on real single-source data show that our model can (a) achieve high prediction accuracy for purchases, (b) discover the key information, i.e., popular items, influential users, and influential advertisers, (c) estimate the relative impact of the three factors on purchases, and (d) find user segments according to the estimated factors.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77292668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Equality and Social Mobility in Twitter Discussion Groups","authors":"Katherine Ellis, M. Goldszmidt, Gert R. G. Lanckriet, Nina Mishra, Omer Reingold","doi":"10.1145/2835776.2835814","DOIUrl":"https://doi.org/10.1145/2835776.2835814","url":null,"abstract":"Online groups, including chat groups and forums, are becoming important avenues for gathering and exchanging information ranging from troubleshooting devices, to sharing experiences, to finding medical information and advice. Thus, issues about the health and stability of these groups are of particular interest to both industry and academia. In this paper we conduct a large scale study with the objectives of first, characterizing essential aspects of the interactions between the participants of such groups and second, characterizing how the nature of these interactions relate to the health of the groups. Specifically, we concentrate on Twitter Discussion Groups (TDGs), self-organized groups that meet on Twitter by agreeing on a hashtag, date and time. These groups have repeated, real-time meetings and are a rising phenomenon on Twitter. We examine the interactions in these groups in terms of the social equality and mobility of the exchange of attention between participants, according to the @mention convention on Twitter. We estimate the health of a group by measuring the retention rate of participants and the change in the number of meetings over time. We find that social equality and mobility are correlated, and that equality and mobility are related to a group's health. In fact, equality and mobility are as predictive of a group's health as some prior characteristics used to predict health of other online groups. Our findings are based on studying 100 thousand sessions of over two thousand discussion groups over the period of June 2012 to June 2013. 
These findings are relevant not only to stakeholders interested in maintaining these groups, but also to researchers and academics interested in understanding the behavior of participants in online discussions. We also find the parallel with findings on the relationship between economic mobility and equality and health indicators in real-world nations striking and thought-provoking.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"54 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79518160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}