Proceedings of the Ninth ACM International Conference on Web Search and Data Mining: Latest Papers

Scaling up Link Prediction with Ensembles
Liang Duan, C. Aggarwal, Shuai Ma, Renjun Hu, J. Huai
{"title":"Scaling up Link Prediction with Ensembles","authors":"Liang Duan, C. Aggarwal, Shuai Ma, Renjun Hu, J. Huai","doi":"10.1145/2835776.2835815","DOIUrl":"https://doi.org/10.1145/2835776.2835815","url":null,"abstract":"A network with $n$ nodes contains $O(n^2)$ possible links. Even for networks of modest size, it is often difficult to evaluate all pairwise possibilities for links in a meaningful way. Furthermore, even though link prediction is closely related to missing value estimation problems, such as collaborative filtering, it is often difficult to use sophisticated models such as latent factor methods because of their computational complexity over very large networks. Due to this computational complexity, most known link prediction methods are designed for evaluating the link propensity over a specified subset of links, rather than for performing a global search over the entire networks. In practice, however, it is essential to perform an exhaustive search over the entire networks. In this paper, we propose an ensemble enabled approach to scaling up link prediction, which is able to decompose traditional link prediction problems into subproblems of smaller size. These subproblems are each solved with the use of latent factor models, which can be effectively implemented over networks of modest size. Furthermore, the ensemble enabled approach has several advantages in terms of performance. We show the advantage of using ensemble-based latent factor models with experiments on very large networks. 
Experimental results demonstrate the effectiveness and scalability of our approach.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"56 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81437989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
WSDM 2016 Workshop on the Ethics of Online Experimentation
Fernando Diaz, Solon Barocas
{"title":"WSDM 2016 Workshop on the Ethics of Online Experimentation","authors":"Fernando Diaz, Solon Barocas","doi":"10.1145/2835776.2855117","DOIUrl":"https://doi.org/10.1145/2835776.2855117","url":null,"abstract":"Online experimentation is now a core and near-constant part of the operation of a production online service, such as a web search engine or social media service. These are large-scale experiments that involve research subjects often numbering in the hundreds of thousands and wide-ranging, computer-automated variations in experimental treatment. In some cases, the results of online experiments may be of use internally to optimize system performance (for example, a test may be conducted to help make web page layout decisions). In other cases, the results may be of academic interest (for example, an experiment may be conducted to test a hypothesis about human behavior). Because of their rapid deployment and broad impact, online experimentation systems provide an extremely valuable tool for scientists and engineers. Despite this statistical power, in some situations, an online experiment can raise difficult ethical questions. One only needs to revisit the conversations resulting from the Facebook emotional contagion experiment to understand that some experiments may, at the very least, warrant careful review before being conducted. Since this episode, scholarship published mainly in the qualitative research and information law communities indicates that this may not be an isolated incident. Ethical and legal problems probably arise in other online experiments, published or not. As experimentation platforms and users become easily accessible, scientists and practitioners may increasingly put the well-being and trust of end users at risk. In light of these concerns, organizations often review online experiments before they are actually conducted. 
In production settings, the review process might vary with respect to formality or standards across companies and even groups within companies. When intended or used for academic publication, experiments or data may have undergone inconsistent review processes, some implementing academic-style institutional review boards and others none at all. Although there is a suggestion that service providers are concerned about the wellbeing of end users, the community does not","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83421331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Session details: Practice & Experience Track
Brian D. Davison
{"title":"Session details: Practice & Experience Track","authors":"Brian D. Davison","doi":"10.1145/3253881","DOIUrl":"https://doi.org/10.1145/3253881","url":null,"abstract":"","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"113 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80618494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
On the Efficiency of the Information Networks in Social Media
Mahmoudreza Babaei, Przemyslaw A. Grabowicz, I. Valera, K. Gummadi, M. Gomez-Rodriguez
{"title":"On the Efficiency of the Information Networks in Social Media","authors":"Mahmoudreza Babaei, Przemyslaw A. Grabowicz, I. Valera, K. Gummadi, M. Gomez-Rodriguez","doi":"10.1145/2835776.2835826","DOIUrl":"https://doi.org/10.1145/2835776.2835826","url":null,"abstract":"Social media sites are information marketplaces, where users produce and consume a wide variety of information and ideas. In these sites, users typically choose their information sources, which in turn determine what specific information they receive, how much information they receive and how quickly this information is shown to them. In this context, a natural question that arises is how efficient are social media users at selecting their information sources. In this work, we propose a computational framework to quantify users' efficiency at selecting information sources. Our framework is based on the assumption that the goal of users is to acquire a set of unique pieces of information. To quantify a user's efficiency, we ask if the user could have acquired the same pieces of information from another set of sources more efficiently. We define three different notions of efficiency -- link, in-flow, and delay -- corresponding to the number of sources the user follows, the amount of (redundant) information she acquires and the delay with which she receives the information. Our definitions of efficiency are general and applicable to any social media system with an underlying information network, in which every user follows others to receive the information they produce. In our experiments, we measure the efficiency of Twitter users at acquiring different types of information. We find that Twitter users exhibit sub-optimal efficiency across the three notions of efficiency, although they tend to be more efficient at acquiring non-popular pieces of information than they are at acquiring popular pieces of information. 
We then show that this lack of efficiency is a consequence of the triadic closure mechanism by which users typically discover and follow other users in social media. Thus, our study reveals a tradeoff between the efficiency and discoverability of information sources. Finally, we develop a heuristic algorithm that enables users to be significantly more efficient at acquiring the same unique pieces of information.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"183 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83032681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Publication Date Prediction through Reverse Engineering of the Web
L. Ostroumova, P. Prokhorenkov, E. Samosvat, P. Serdyukov
{"title":"Publication Date Prediction through Reverse Engineering of the Web","authors":"L. Ostroumova, P. Prokhorenkov, E. Samosvat, P. Serdyukov","doi":"10.1145/2835776.2835796","DOIUrl":"https://doi.org/10.1145/2835776.2835796","url":null,"abstract":"In this paper, we focus on one of the most challenging tasks in temporal information retrieval: detection of a web page publication date. The natural approach to this problem is to find the publication date in the HTML body of a page. However, there are two fundamental problems with this approach. First, not all web pages contain the publication dates in their texts. Second, it is hard to distinguish the publication date among all the dates found in the page's text. The approach we suggest in this paper supplements methods of date extraction from the page's text with novel link-based methods of dating. Some of our link-based methods are based on a probabilistic model of the Web graph structure evolution, which relies on the publication dates of web pages as its parameters. We use this model to estimate the publication dates of web pages: based only on the link structure currently observed, we perform a “reverse engineering” to reveal the whole process of the Web's evolution.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91028562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Feedback Control of Real-Time Display Advertising
Weinan Zhang, Yifei Rong, Jun Wang, Tianchi Zhu, Xiaofan Wang
{"title":"Feedback Control of Real-Time Display Advertising","authors":"Weinan Zhang, Yifei Rong, Jun Wang, Tianchi Zhu, Xiaofan Wang","doi":"10.1145/2835776.2835843","DOIUrl":"https://doi.org/10.1145/2835776.2835843","url":null,"abstract":"Real-Time Bidding (RTB) is revolutionising display advertising by facilitating per-impression auctions to buy ad impressions as they are being generated. Being able to use impression-level data, such as user cookies, encourages user behaviour targeting, and hence has significantly improved the effectiveness of ad campaigns. However, a fundamental drawback of RTB is its instability because the bid decision is made per impression and there are enormous fluctuations in campaigns' key performance indicators (KPIs). As such, advertisers face great difficulty in controlling their campaign performance against the associated costs. In this paper, we propose a feedback control mechanism for RTB which helps advertisers dynamically adjust the bids to effectively control the KPIs, e.g., the auction winning ratio and the effective cost per click. We further formulate an optimisation framework to show that the proposed feedback control mechanism also has the ability of optimising campaign performance. By settling the effective cost per click at an optimal reference value, the number of campaign's ad clicks can be maximised with the budget constraint. Our empirical study based on real-world data verifies the effectiveness and robustness of our RTB control system in various situations. 
The proposed feedback control mechanism has also been deployed on a commercial RTB platform and the online test has shown its success in generating controllable advertising performance.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90303096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57
Representation Learning for Information Diffusion through Social Networks: an Embedded Cascade Model
Simon Bourigault, S. Lamprier, P. Gallinari
{"title":"Representation Learning for Information Diffusion through Social Networks: an Embedded Cascade Model","authors":"Simon Bourigault, S. Lamprier, P. Gallinari","doi":"10.1145/2835776.2835817","DOIUrl":"https://doi.org/10.1145/2835776.2835817","url":null,"abstract":"In this paper, we focus on information diffusion through social networks. Based on the well-known Independent Cascade model, we embed users of the social network in a latent space to extract more robust diffusion probabilities than those defined by classical graphical learning approaches. The better generalization provided by such a projection space allows our approach to perform well on various real-world datasets, for both diffusion prediction and influence relationship inference tasks. Additionally, the use of a projection space enables our model to deal with larger social networks.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86742203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 131
Detecting Social Media Icebergs by Their Tips: Rumors, Persuasion Campaigns, and Information Needs
Zhe Zhao
{"title":"Detecting Social Media Icebergs by Their Tips: Rumors, Persuasion Campaigns, and Information Needs","authors":"Zhe Zhao","doi":"10.1145/2835776.2855086","DOIUrl":"https://doi.org/10.1145/2835776.2855086","url":null,"abstract":"Online activities of more than one billion social media users all over the world form a resourceful ocean of data. Many social media mining techniques try to explore this ocean and extract different types of resources. In this thesis, we present a framework that can detect different types of meaningful social media phenomena. They can usually be viewed as a group of online activities from many social media users with a common or similar objective, such as the spreading of rumors, bursts of information needs about events and products, or requests for support of an action. These different types of social media phenomena are relatively rare but can be very influential. Detecting them is challenging because of their characteristics. Each phenomenon contains a collection of activities that usually take a variety of forms. Taking the spreading of rumors in social media as an example, one rumor may be spread through different statements and expressions, and it can be very hard to distinguish them from statements from trustworthy sources. Existing work on detecting different types of social media phenomena usually adopts classifiers trained on features of a single activity or a cluster of activities [1]. However, the features of a single activity are not sufficient for many detection tasks, and the features of a cluster of activities do not become significant until the cluster is large enough, so they cannot be used for early-stage detection. In this thesis, we propose to detect meaningful social media phenomena from signal user behaviors observed at an early stage. Just like spotting icebergs in the ocean by their tips, in our case, the tip of a social media iceberg is a small proportion of activities that exist only in social media icebergs. 
They can be found even at an early stage. Therefore, we design our detection framework to first detect these specific signal activities. We then use them to understand the characteristics of the entire collection of activities in a social media phenomenon. What we learn can be used to train accurate classifiers to identify whether a collection of activities containing signal activities is a target social media phenomenon or not. This framework is generic and can be applied to detecting many different types of collective activities in social media. We apply our framework to detecting three types of meaningful social media phenomena, i.e., emerging [WSDM 2016, February 22-25, 2016, San Francisco, CA, USA. © 2016 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-3716-8/16/02.]","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"83 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81427813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Transductive Classification on Heterogeneous Information Networks with Edge Betweenness-based Normalization
P. Bangcharoensap, T. Murata, Hayato Kobayashi, N. Shimizu
{"title":"Transductive Classification on Heterogeneous Information Networks with Edge Betweenness-based Normalization","authors":"P. Bangcharoensap, T. Murata, Hayato Kobayashi, N. Shimizu","doi":"10.1145/2835776.2835799","DOIUrl":"https://doi.org/10.1145/2835776.2835799","url":null,"abstract":"This paper proposes a novel method for transductive classification on heterogeneous information networks composed of multiple types of vertices. Such networks naturally represent many real-world Web data such as DBLP data (author, paper, and conference). Given a network where some vertices are labeled, the classifier aims to predict labels for the remaining vertices by propagating the labels to the entire network. In the label propagation process, many studies reduce the importance of edges connecting to a high-degree vertex. The assumption is unsatisfactory when reliability of a label of a vertex cannot be implied from its degree. On the basis of our intuition that edges bridging across communities are less trustworthy, we adapt edge betweenness to imply the importance of edges. Since directly applying the conventional edge betweenness is inefficient on heterogeneous networks, we propose two additional refinements. First, the centrality utilizes the fact that networks contain multiple types of vertices. Second, the centrality ignores flows originating from endpoints of considering edges. The experimental results on real-world datasets show our proposed method is more effective than a state-of-the-art method, GNetMine. On average, our method yields 92.79 ± 1.25% accuracy on a DBLP network even if only 1.92% of vertices are labeled. 
Our simple weighting scheme results in more than 5 percentage points increase in accuracy compared with GNetMine.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82550261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Optimizing Search Interactions within Professional Social Networks
N. Spirin
{"title":"Optimizing Search Interactions within Professional Social Networks","authors":"N. Spirin","doi":"10.1145/2835776.2855092","DOIUrl":"https://doi.org/10.1145/2835776.2855092","url":null,"abstract":"To help users cope with the scale and influx of new information, professional social networks (PSNs) provide a search functionality. However, most of the search engines within PSNs today only support keyword queries and basic faceted search capabilities overlooking serendipitous network exploration and search for relationships between entities. This results in siloed information and a limited search space. My thesis is that we must redesign all major elements of a search user interface, such as input, control, and informational, to enable more effective search interactions within PSNs. I will introduce new insights and algorithms supporting the thesis.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85302159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0