Proceedings of the Ninth ACM International Conference on Web Search and Data Mining最新文献

筛选
英文 中文
Beyond Ranking: Optimizing Whole-Page Presentation 超越排名:优化整个页面的呈现
Yue Wang, Dawei Yin, Luo Jie, Pengyuan Wang, M. Yamada, Yi Chang, Q. Mei
{"title":"Beyond Ranking: Optimizing Whole-Page Presentation","authors":"Yue Wang, Dawei Yin, Luo Jie, Pengyuan Wang, M. Yamada, Yi Chang, Q. Mei","doi":"10.1145/2835776.2835824","DOIUrl":"https://doi.org/10.1145/2835776.2835824","url":null,"abstract":"Modern search engines aggregate results from different verticals: webpages, news, images, video, shopping, knowledge cards, local maps, etc. Unlike \"ten blue links\", these search results are heterogeneous in nature and not even arranged in a list on the page. This revolution directly challenges the conventional \"ranked list\" formulation in ad hoc search. Therefore, finding proper presentation for a gallery of heterogeneous results is critical for modern search engines. We propose a novel framework that learns the optimal page presentation to render heterogeneous results onto search result page (SERP). Page presentation is broadly defined as the strategy to present a set of items on SERP, much more expressive than a ranked list. It can specify item positions, image sizes, text fonts, and any other styles as long as variations are within business and design constraints. The learned presentation is content-aware, i.e. tailored to specific queries and returned results. Simulation experiments show that the framework automatically learns eye-catchy presentations for relevant results. Experiments on real data show that simple instantiations of the framework already outperform leading algorithm in federated search result presentation. It means the framework can learn its own result presentation strategy purely from data, without even knowing the \"probability ranking principle\".","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75074618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 69
Large-Scale Deep Learning For Building Intelligent Computer Systems 构建智能计算机系统的大规模深度学习
J. Dean
{"title":"Large-Scale Deep Learning For Building Intelligent Computer Systems","authors":"J. Dean","doi":"10.1145/2835776.2835844","DOIUrl":"https://doi.org/10.1145/2835776.2835844","url":null,"abstract":"For the past five years, the Google Brain team has focused on conducting research in difficult problems in artificial intelligence, on building large-scale computer systems for machine learning research, and, in collaboration with many teams at Google, on applying our research and systems to dozens of Google products. Our group has recently open-sourced the TensorFlow system (tensorflow.org), a system designed to easily express machine ideas, and to quickly train, evaluate and deploy machine learning systems. In this talk, I'll highlight some of the design decisions we made in building TensorFlow, discuss research results produced within our group, and describe ways in which these ideas have been applied to a variety of problems in Google's products, usually in close collaboration with other teams. This talk describes joint work with many people at Google.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"325 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77587030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
On Obtaining Effort Based Judgements for Information Retrieval 基于努力的信息检索判断方法研究
Manisha Verma, Emine Yilmaz, Nick Craswell
{"title":"On Obtaining Effort Based Judgements for Information Retrieval","authors":"Manisha Verma, Emine Yilmaz, Nick Craswell","doi":"10.1145/2835776.2835840","DOIUrl":"https://doi.org/10.1145/2835776.2835840","url":null,"abstract":"Document relevance has been the primary focus in the design, optimization and evaluation of retrieval systems. Traditional testcollections are constructed by asking judges the relevance grade for a document with respect to an input query. Recent work of Yilmaz et al. found an evidence that effort is another important factor in determining document utility, suggesting that more thought should be given into incorporating effort into information retrieval. However, that work did not ask judges to directly assess the level of effort required to consume a document or analyse how effort judgements relate to traditional relevance judgements. In this work, focusing on three aspects associated with effort, we show that it is possible to get judgements of effort from the assessors. We further show that given documents of the same relevance grade, effort needed to find the portion of the document relevant to the query is a significant factor in determining user satisfaction as well as user preference between these documents. Our results suggest that if the end goal is to build retrieval systems that optimize user satisfaction, effort should be included as an additional factor to relevance in building and evaluating retrieval systems. We further show that new retrieval features are needed if the goal is to build retrieval systems that jointly optimize relevance and effort and propose a set of such features. Finally, we focus on the evaluation of retrieval systems and show that incorporating effort into retrieval evaluation could lead to significant differences regarding the performance of retrieval systems.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78859454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 24
Who Will Reply to/Retweet This Tweet?: The Dynamics of Intimacy from Online Social Interactions 谁会回复/转发这条推文?:来自网络社交互动的亲密动态
Nicholas Jing Yuan, Yuan Zhong, Fuzheng Zhang, Xing Xie, Chin-Yew Lin, Y. Rui
{"title":"Who Will Reply to/Retweet This Tweet?: The Dynamics of Intimacy from Online Social Interactions","authors":"Nicholas Jing Yuan, Yuan Zhong, Fuzheng Zhang, Xing Xie, Chin-Yew Lin, Y. Rui","doi":"10.1145/2835776.2835800","DOIUrl":"https://doi.org/10.1145/2835776.2835800","url":null,"abstract":"Friendships are dynamic. Previous studies have converged to suggest that social interactions, in both online and offline social networks, are diagnostic reflections of friendship relations (also called social ties). However, most existing approaches consider a social tie as either a binary relation, or a fixed value (named tie strength). In this paper, we investigate the dynamics of dyadic friend relationships through online social interactions, in terms of a variety of aspects, such as reciprocity, temporality, and contextuality. In turn, we propose a model to predict repliers and retweeters given a particular tweet posted at a certain time in a microblog-based social network. More specifically, we have devised a learning-to-rank approach to train a ranker that considers elaborate user-level and tweet-level features (like sentiment, self-disclosure, and responsiveness) to address these dynamics. In the prediction phase, a tweet posted by a user is deemed a query and the predicted repliers/retweeters are retrieved using the learned ranker. We have collected a large dataset containing 73.3 million dyadic relationships with their interactions (replies and retweets). Extensive experimental results based on this dataset show that by incorporating the dynamics of friendship relations, our approach significantly outperforms state-of-the-art models in terms of multiple evaluation metrics, such as MAP, NDCG and Topmost Accuracy. In particular, the advantage of our model is even more promising in predicting the exact sequence of repliers/retweeters considering their orders. Furthermore, the proposed approach provides emerging implications for many high-value applications in online social networks.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"157 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87684015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 40
Serving a Billion Personalized News Feeds 为10亿个个性化新闻源提供服务
L. Backstrom
{"title":"Serving a Billion Personalized News Feeds","authors":"L. Backstrom","doi":"10.1145/2835776.2835848","DOIUrl":"https://doi.org/10.1145/2835776.2835848","url":null,"abstract":"Feed ranking's goal is to provide perople with over a billion personalized experiences. We strive to provide the most compelling content to each person, personalized to them so that they are most likely to see the content that is most interesting to them. Similar to a newspaper, putting the right stories above the fold has always been critical to engaging customers and interesting them in the rest of the paper. In feed ranking, we face a similar challenge, but on a grander scale. Each time a person visits, we need to find the best piece of content out of all the available stories and put it at the top of feed where people are most likely to see it. To accomplish this, we do large-scale machine learning to model each person, figure out which friends, pages and topics they care about and pick the stories each particular person is interested in. In addition to the large-scale machine learning problems we work on, another primary area of research is understanding the value we are creating for people and making sure that our objective function is in alignment with what people want.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88234798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Session details: Practice & Experience Track 会议详情:实践与经验专场
Marc Najork
{"title":"Session details: Practice & Experience Track","authors":"Marc Najork","doi":"10.1145/3253878","DOIUrl":"https://doi.org/10.1145/3253878","url":null,"abstract":"","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"79 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86201444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation 建议,或不建议的查询具有不同的意图:优化搜索结果的表现
Makoto P. Kato, Katsumi Tanaka
{"title":"To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation","authors":"Makoto P. Kato, Katsumi Tanaka","doi":"10.1145/2835776.2835805","DOIUrl":"https://doi.org/10.1145/2835776.2835805","url":null,"abstract":"We propose a method of optimizing search result presentation for queries with diverse intents, by selectively presenting query suggestions for leading users to more relevant search results. The optimization is based on a probabilistic model of users who click on query suggestions in accordance with their intents, and modified versions of intent-aware evaluation metrics that take into account the co-occurrence between intents. Showing many query suggestions simply increases a chance to satisfy users with diverse intents in this model, while it in fact requires users to spend additional time for scanning and selecting suggestions, and may result in low satisfaction for some users. Therefore, we measured the loss of time caused by query suggestion presentation by conducting a user study in different settings, and included its negative effects in our optimization problem. Our experiments revealed that the optimization of search result presentation significantly improved that of a single ranked list, and was beneficial especially for patient users. Moreover, experimental results showed that our optimization was effective particularly when intents of a query often co-occur with a small subset of intents.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89198499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Semantic Documents Relatedness using Concept Graph Representation 使用概念图表示的语义文档相关性
Y. Ni, Qiongkai Xu, Feng Cao, Y. Mass, D. Sheinwald, Hui Zhu, Shao Sheng Cao
{"title":"Semantic Documents Relatedness using Concept Graph Representation","authors":"Y. Ni, Qiongkai Xu, Feng Cao, Y. Mass, D. Sheinwald, Hui Zhu, Shao Sheng Cao","doi":"10.1145/2835776.2835801","DOIUrl":"https://doi.org/10.1145/2835776.2835801","url":null,"abstract":"We deal with the problem of document representation for the task of measuring semantic relatedness between documents. A document is represented as a compact concept graph where nodes represent concepts extracted from the document through references to entities in a knowledge base such as DBpedia. Edges represent the semantic and structural relationships among the concepts. Several methods are presented to measure the strength of those relationships. Concepts are weighted through the concept graph using closeness centrality measure which reflects their relevance to the aspects of the document. A novel similarity measure between two concept graphs is presented. The similarity measure first represents concepts as continuous vectors by means of neural networks. Second, the continuous vectors are used to accumulate pairwise similarity between pairs of concepts while considering their assigned weights. We evaluate our method on a standard benchmark for document similarity. Our method outperforms state-of-the-art methods including ESA (Explicit Semantic Annotation) while our concept graphs are much smaller than the concept vectors generated by ESA. Moreover, we show that by combining our concept graph with ESA, we obtain an even further improvement.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91007994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 56
Is Mail The Next Frontier In Search And Data Mining? 邮件是搜索和数据挖掘的下一个前沿领域吗?
Y. Maarek
{"title":"Is Mail The Next Frontier In Search And Data Mining?","authors":"Y. Maarek","doi":"10.1145/2835776.2835847","DOIUrl":"https://doi.org/10.1145/2835776.2835847","url":null,"abstract":"The nature of Web mail traffic has significantly evolved in the last two decades, and consequently the behavior of Web mail users has also changed. For instance a recent study conducted by Yahoo Labs showed that today 90% of Web mail traffic is machine-generated. This partly explains why email traffic continues to grow even if a significant amount of personal communications has moved towards social media. Most users today are receiving in their inbox important invoices, receipts, and travel itineraries, together with non-malicious junk mail such as hotel newsletters or shopping promotions that could safely ignore. This is one of the reasons that a majority of messages remain unread, and many are deleted without being read. In that sense, Web mail has become quite similar to traditional snail mail. In spite of this drastic change in nature, many mail features remain unchanged. While 70% of mail users do not define even a single folder, folders are still predominant in the left trail of many Web mail clients. Mail search results are still mostly ranked by date, which makes the retrieving of older messages extremely challenging. This is even more painful to users, as unlike in Web search, they will know when a relevant previously read message has not been returned. In this talk, I present the results of multiple large-scale studies that have been conducted at Yahoo Labs in the last few years. I highlight the inherent challenges associated with such studies, especially around privacy concerns. I will discuss the new nature of consumer Web mail, which is dominated by machine-generated messages of highly heterogeneous forms and value. I will show how the change has not been fully recognized yet by my most email clients. As an example, why should there still be a reply option associated with a message coming from a \"do-not-reply@\" address?. I will introduce some approaches for large-scale mail mining specifically tailored to machine-generated email. I will conclude by discussing possible applications and research directions.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"69 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89941659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Web-scale Multimedia Search for Internet Video Content 互联网视频内容的网络规模多媒体搜索
Lu Jiang
{"title":"Web-scale Multimedia Search for Internet Video Content","authors":"Lu Jiang","doi":"10.1145/2835776.2855081","DOIUrl":"https://doi.org/10.1145/2835776.2855081","url":null,"abstract":"The Internet has been witnessing an explosion of video content. According to a Cisco study, video content is estimated to account for 80% of all the entire world's internet traffic by 2019. Video data are becoming one of the most valuable sources to assess information and knowledge. However, existing video search solutions are still based on text matching (text-to-text search), and could fail for the huge volumes of videos that have little relevant metadata or no metadata at all. The need for large-scale and intelligent video search, which bridges the gap between the user's information need and the video content, seems to be urgent. In this thesis, we propose an accurate, efficient and scalable search method for video content. As opposed to text matching, the proposed method relies on automatic video content understanding, and allows for intelligent and flexible search paradigms over the video content, including text-to-video and text&video-to-video search. Suppose our goal is to search the videos about birthday party. In traditional text-to-text queries, we have to search the keywords in the user-generated metadata (titles or descriptions). In a text-to-video query, however, we might look for visual clues in the video content such as \"cake\", \"gift\" and \"kids\", audio clues like \"birthday song\" and \"cheering sound\", or visible text like \"happy birthday\". Text-to-video queries are flexible and can be further refined by Boolean and temporal operators. After watching the retrieved videos, the user may select a few interesting videos to find more relevant videos like these. This can be achieved by issuing a text&video-to-video query which adds the selected video examples to the query. The proposed method provides a new dimension of looking at content-based video search, from finding a simple concept like \"puppy\" to searching a complex incident like \"a scene in urban area where people running away after an explosion\". To achieve this ambitious goal, we propose several novel methods focusing on accuracy, efficiency and scalability in the novel search paradigm. First, we introduce a novel self-paced curriculum learning theory that allows for training more accurate semantic concepts. Second, we propose a novel and scalable approach to index semantic concepts that can significantly improve the search efficiency with minimum accuracy loss. Third, we design a novel video reranking algorithm that can boost accuracy for video retrieval. The extensive experiments demonstrate that the proposed methods are able to surpass state-of-the-art accuracy on multiple datasets. In addition, our method can efficiently scale up the search to hundreds of millions videos, and only takes about 0.2 second to search a semantic query on a collection of 100 million videos, 1 second to process a hybrid query over 1 million videos. Based on the proposed methods, we implement E-Lamp Lite, the first of its kind large-scale semantic search engine for Internet videos. According to Natio","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89400407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信