Proceedings of the Eighth ACM International Conference on Web Search and Data Mining: Latest Papers

Global Optimization for Display Ad
Rong Ji
Abstract: Online display advertising has been examined by numerous studies. Most online display ad systems take a greedy approach: for each user, they display the set of ads that best match the user's interests. One shortcoming of the greedy approach is that it does not take into account each advertiser's budget limit. As a result, we often observe that some ads are popular and match the interests of millions of users, but because of budget restrictions they can be presented only a limited number of times, leading to suboptimal performance. To make the point clear, consider a simple case with only two advertisers (A and B) and two users (a and b). We assume that each advertiser has a budget of only one display. We further assume that user a is interested in both ads, although more interested in ad A, while user b is interested only in ad A. Under the greedy approach, we will always present ad A to user a; as a result, if user a arrives before user b, we will have no appropriate ad to display for user b. If, on the other hand, we take the budget limits of both advertisers into account, a better approach is to present ad B to user a and ad A to user b. This simple example motivates us to develop a global optimization approach for online display advertising that explicitly takes advertisers' budget limits into account when deciding which ads to present to individual users. The key idea of the proposed approach is to compute a user-ad assignment matrix that maximizes the number of clicks subject to the ad budgets of individual advertisers.
The main computational challenge is the size of the variable to be optimized: since the numbers of users and advertisements in our system are on the order of one billion and ten thousand, respectively, we need to estimate a matrix of roughly one billion by ten thousand entries. We address this challenge by converting the original optimization problem into its dual problem, in which the number of variables is reduced to about ten thousand. A distributed computing algorithm, based on Nesterov's method and the MapReduce framework, was developed to solve the resulting optimization problem efficiently. We have observed that the proposed algorithm significantly improves the effectiveness of ad presentation compared with the greedy algorithm.
DOI: https://doi.org/10.1145/2684822.2697048
Published: 2015-02-02
Citations: 0
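The two-advertiser, two-user example in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integer interest scores are hypothetical (the abstract gives only their ordering), and the exhaustive search stands in for the global optimum on this tiny instance. It is not the dual-based distributed algorithm the abstract describes.

```python
from itertools import product

# Hypothetical interest scores (assumed values; the abstract only says that
# user a prefers ad A over ad B, and user b is interested only in ad A).
interest = {
    ("a", "A"): 9,
    ("a", "B"): 6,
    ("b", "A"): 8,
    ("b", "B"): 0,
}
budget = {"A": 1, "B": 1}      # each advertiser can afford one display
arrival_order = ["a", "b"]     # user a arrives before user b


def greedy(order, interest, budget):
    """Give each arriving user the best-matching ad that still has budget."""
    remaining = dict(budget)
    assignment = {}
    for user in order:
        candidates = [
            ad for ad in remaining
            if remaining[ad] > 0 and interest[(user, ad)] > 0
        ]
        if not candidates:
            assignment[user] = None  # no appropriate ad is left
            continue
        best = max(candidates, key=lambda ad: interest[(user, ad)])
        assignment[user] = best
        remaining[best] -= 1
    return assignment


def global_optimum(users, interest, budget):
    """Enumerate every budget-feasible assignment and keep the best one."""
    ads = list(budget)
    best_value, best_assignment = -1, None
    for choice in product(ads + [None], repeat=len(users)):
        if any(choice.count(ad) > budget[ad] for ad in ads):
            continue  # violates an advertiser's budget
        value = sum(
            interest[(u, ad)] for u, ad in zip(users, choice) if ad is not None
        )
        if value > best_value:
            best_value, best_assignment = value, dict(zip(users, choice))
    return best_assignment


def total(assignment, interest):
    """Sum the interest scores of all displayed ads."""
    return sum(interest[(u, ad)] for u, ad in assignment.items() if ad is not None)


greedy_result = greedy(arrival_order, interest, budget)
optimal_result = global_optimum(arrival_order, interest, budget)
print(greedy_result, total(greedy_result, interest))
print(optimal_result, total(optimal_result, interest))
```

With these assumed scores, the greedy pass leaves user b without an ad, while the budget-aware assignment shows ad B to user a and ad A to user b, as the abstract argues. Exhaustive enumeration is feasible only for toy instances, which is why the abstract turns to the dual problem at scale.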
Session details: Session 9: Web Mining (2)
F. Silvestri
DOI: https://doi.org/10.1145/3251101
Published: 2015-02-02
Citations: 0
An Approach to the Problem of Annotation of Research Publications
Ekaterina Chernyak
Abstract: An approach to multi-label annotation of research papers is explored. We develop techniques for annotating (labeling) research papers in informatics and computer science with key phrases taken from the ACM Computing Classification System. The techniques use a phrase-to-text relevance measure so that only the most relevant phrases go into the annotation. Three phrase-to-text relevance measures are compared experimentally in this setting: (a) the cosine relevance score between conventional vector space representations of the texts, coded with tf-idf weighting; (b) BM25, a popular measure based on the probability of term generation; and (c) CPAMF, an in-house measure defined as the conditional probability of symbols averaged over matching fragments in suffix trees that represent the texts and phrases. In an experiment conducted over a set of texts published in ACM journals and manually annotated by their authors, CPAMF outperforms both the cosine measure and BM25 by a wide margin.
DOI: https://doi.org/10.1145/2684822.2697032
Published: 2015-02-02
Citations: 22
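Measure (a) in the abstract above, cosine relevance over tf-idf vectors, can be sketched as follows. This is a minimal sketch, not the paper's implementation: the whitespace tokenizer, the smoothed idf variant `log(N / df) + 1`, the sample text, and the candidate phrases are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(token_lists):
    """Return one {term: weight} dict per tokenized document.

    The idf variant log(N / df) + 1 is an assumed choice; the abstract
    does not say which tf-idf weighting was used.
    """
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return [
        {
            term: count * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in Counter(tokens).items()
        }
        for tokens in token_lists
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical text and candidate key phrases (stand-ins for taxonomy labels).
text = "suffix trees support fast string matching over large text collections"
phrases = ["string matching", "suffix trees", "database systems"]

# Weight the text and the phrases over one shared collection.
token_lists = [text.split()] + [p.split() for p in phrases]
text_vec, *phrase_vecs = tf_idf_vectors(token_lists)

scores = {p: cosine(text_vec, vec) for p, vec in zip(phrases, phrase_vecs)}
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for phrase, score in ranked:
    print(f"{phrase}: {score:.3f}")
```

Phrases that share no terms with the text score zero, which illustrates one limitation of exact term matching. The abstract's suffix-tree measure, CPAMF, works on matching fragments rather than whole terms.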
Exploring the Space of Topic Coherence Measures
Michael Röder, A. Both, Alexander Hinneburg
Abstract: Quantifying the coherence of a set of statements is a long-standing problem with many potential applications that has attracted researchers from different sciences. The special case of measuring the coherence of topics has recently been studied to remedy the problem that topic models give no guarantee of the interpretability of their output. Several benchmark datasets have been produced that record human judgements of the interpretability of topics. We are the first to propose a framework that allows existing word-based coherence measures, as well as new ones, to be constructed by combining elementary components. We conduct a systematic search of the space of coherence measures, using all publicly available topic relevance data for the evaluation. Our results show that new combinations of components outperform existing measures with respect to correlation with human ratings. Finally, we outline how our results can be transferred to further applications in the context of text mining, information retrieval, and the World Wide Web.
DOI: https://doi.org/10.1145/2684822.2685324
Published: 2015-02-02
Citations: 1320
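The idea of building a coherence measure from interchangeable components can be illustrated with a small sketch. The abstract above does not name the components, so the decomposition here is an assumption: word pairs as the segmentation, document co-occurrence for probabilities, NPMI as the confirmation score, and the arithmetic mean as the aggregation. The corpus and the topics are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical reference corpus: each document is a set of words.
documents = [
    {"game", "team", "season"},
    {"game", "team", "player"},
    {"team", "season", "player"},
    {"market", "stock", "price"},
    {"stock", "price", "trade"},
    {"game", "market"},
]


def doc_probability(words, documents):
    """Probability component: share of documents containing all the words."""
    hits = sum(1 for doc in documents if all(w in doc for w in words))
    return hits / len(documents)


def npmi(w1, w2, documents):
    """Confirmation component: normalized pointwise mutual information."""
    p1 = doc_probability([w1], documents)
    p2 = doc_probability([w2], documents)
    p12 = doc_probability([w1, w2], documents)
    if p12 == 0.0:
        return -1.0  # the words never co-occur
    if p12 == 1.0:
        return 0.0   # degenerate case; treating it as 0 is an assumption
    return math.log(p12 / (p1 * p2)) / -math.log(p12)


def coherence(topic_words, documents, confirm=npmi):
    """Segment into word pairs, score each pair, aggregate by the mean."""
    pairs = list(combinations(topic_words, 2))  # segmentation component
    values = [confirm(w1, w2, documents) for w1, w2 in pairs]
    return sum(values) / len(values)            # aggregation component


coherent_topic = ["game", "team", "season"]
mixed_topic = ["game", "stock", "season"]
print(coherence(coherent_topic, documents))
print(coherence(mixed_topic, documents))
```

Because `confirm` is a parameter, swapping in a different pair score yields a different measure without touching the segmentation or aggregation steps. That is the kind of recombination the abstract describes, though the paper's actual component set may differ.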
WSDM'15 Workshop Summary / Scalable Data Analytics: Theory and Applications
Kaizhu Huang, Haiqin Yang, Irwin King, Michael R. Lyu
Abstract: The SDA workshop at WSDM 2015 is the fifth International Workshop on Scalable Data Analytics, following four previous SDA workshops held at IEEE Big Data 2013, PAKDD 2014, IEEE Big Data 2014, and IEEE ICDM 2014, respectively. This series of workshops aims to provide professionals, researchers, and technologists with a single forum where they can discuss and share state-of-the-art theories and applications of scalable data analytics technologies. In particular, in the era of information explosion, the scientific, biomedical, and engineering research communities are undergoing a profound transformation in which discoveries and innovations increasingly rely on massive amounts of data. The volume, velocity, variety, and veracity that characterize big data pose challenges to current data analytics techniques. The focus of the fifth SDA is to discuss how we can scale up data analytics techniques for modeling and analyzing big data from various domains.
DOI: https://doi.org/10.1145/2684822.2697030
Published: 2015-02-02
Citations: 0
Session details: Session 8: Practice & Experience Talks
Xuanjing Huang
DOI: https://doi.org/10.1145/3251100
Published: 2015-02-02
Citations: 0
The 2nd workshop on Vertical Search Relevance at WSDM 2015
Dawei Yin, Chih-Chieh Hung, Rui Li, Yi Chang
Abstract: As information on the web grows exponentially and users' needs become more specific, traditional general-purpose web search engines cannot fully satisfy today's user requirements. Vertical search engines have emerged in various domains; they focus on specific segments of online content, such as local, shopping, medical information, and travel search. Vertical search engines are attracting more attention, and relevance ranking in different vertical search engines is becoming the key technology. In addition, vertical search results are often slotted into general web search results. Hence, designing effective ranking functions for vertical search has become practically important for improving users' experience in both web search and vertical search. The workshop brings together researchers from IR, ML, NLP, and other areas of computer and information science who are working on or interested in this area.
It provides a forum for researchers to identify the issues and challenges, share their latest research results, express a diverse range of opinions about this topic, and discuss future directions.
DOI: https://doi.org/10.1145/2684822.2697031
Published: 2015-02-02
Citations: 0
Regressing Towards Simpler Prediction Systems
Tushar Chandra
Abstract: This talk will focus on our experience in managing the complexity of Sibyl, a large-scale machine learning system that is widely used within Google. We believe that a large fraction of the challenges faced by Sibyl are inherent to large-scale production machine learning and that other production systems are likely to encounter them as well [1]. Thus, these challenges present interesting opportunities for future research. The Sibyl system is complex for a number of reasons. We have learnt that a complete end-to-end machine learning solution has to have subsystems to address a variety of different needs: data ingestion, data analysis, data verification, experimentation, model analysis, model serving, configuration, data transformations, support for different kinds of loss functions and modeling, machine learning algorithm implementations, etc. Machine learning algorithms themselves constitute a relatively small fraction of the overall system. Each subsystem consists of a number of distinct components to support the variety of product needs. For example, Sibyl supports more than 5 different model serving systems, each with its own idiosyncrasies and challenges. In addition, Sibyl's configuration contains more lines of code than the core Sibyl learner itself. Finally, existing solutions for some of the challenges don't feel adequate, and we believe these challenges present opportunities for future research. Though the overall system is complex, our users need to be able to deploy solutions quickly. This is because a machine learning deployment is typically an iterative process of model improvements. At each iteration, our users experiment with new features, find those that improve the model's prediction capability, and then "launch" a new model with those improved features.
A user may go through 10 or more such productive launches. Not only is speed of iteration crucial to our users, but they are often willing to sacrifice the improved prediction quality of a high-quality but cumbersome system for the speed of iteration of a lower-quality but nimble system. In this talk I will give an example of how simplification drives systems design and sometimes the design of novel algorithms.
DOI: https://doi.org/10.1145/2684822.2697045
Published: 2015-02-02
Citations: 1
Inferring Movement Trajectories from GPS Snippets
Mu Li, Amr Ahmed, Alex Smola
Abstract: Inferring movement trajectories can be a challenging task, in particular when detailed tracking information is not available because of privacy and data collection constraints. In this paper we present a complete and computationally tractable model for estimating and predicting trajectories based on sparsely sampled, anonymous GPS landmarks that we call GPS snippets. To combat data sparsity, we use mapping data as side information to constrain the inference process. We show the efficacy of our approach on a set of prediction tasks over data collected from different cities in the US.
DOI: https://doi.org/10.1145/2684822.2685313
Published: 2015-02-02
Citations: 47
Real-Time Bidding: A New Frontier of Computational Advertising Research
Jun Wang, Shuai Yuan
Abstract: In display and mobile advertising, the most significant development in recent years is Real-Time Bidding (RTB), which allows ad impressions to be sold and bought in real time, one impression at a time. RTB has fundamentally changed the landscape of digital marketing by scaling the buying process across a large number of available inventories. The demand for automation, integration and optimisation in RTB brings new research opportunities in the IR/DM/ML fields. However, despite its rapid growth and huge potential, many aspects of RTB remain unknown to the research community for many reasons. In this tutorial, together with invited distinguished speakers from the online advertising industry, we aim to bring insights from real-world systems to bridge the gaps and to provide an overview of the fundamental infrastructure, algorithms, and technical and research challenges of this new frontier of computational advertising. We will also introduce researchers to the publicly available datasets, tools, and platforms so that they can get hands-on quickly.
DOI: https://doi.org/10.1145/2684822.2697041
Published: 2015-02-02
Citations: 58