Proceedings of the sixth ACM international conference on Web search and data mining: latest publications

Learning to rank for spatiotemporal search
B. Shaw, Jon Shea, Siddhartha Sinha, A. Hogue
{"title":"Learning to rank for spatiotemporal search","authors":"B. Shaw, Jon Shea, Siddhartha Sinha, A. Hogue","doi":"10.1145/2433396.2433485","DOIUrl":"https://doi.org/10.1145/2433396.2433485","url":null,"abstract":"In this article we consider the problem of mapping a noisy estimate of a user's current location to a semantically meaningful point of interest, such as a home, restaurant, or store. Despite the poor accuracy of GPS on current mobile devices and the relatively high density of places in urban areas, it is possible to predict a user's location with considerable precision by explicitly modeling both places and users and by combining a variety of signals about a user's current context. Places are often simply modeled as a single latitude and longitude when in fact they are complex entities existing in both space and time and shaped by the millions of people that interact with them. Similarly, models of users reveal complex but predictable patterns of mobility that can be exploited for this task. We propose a novel spatial search algorithm that infers a user's location by combining aggregate signals mined from billions of foursquare check-ins with real-time contextual information. 
We evaluate a variety of techniques and demonstrate that machine learning algorithms for ranking and spatiotemporal models of places and users offer significant improvement over common methods for location search based on distance and popularity.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114672605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 97
Data-driven political science
Ingmar Weber, Ana-Maria Popescu, M. Pennacchiotti
{"title":"Data-driven political science","authors":"Ingmar Weber, Ana-Maria Popescu, M. Pennacchiotti","doi":"10.1145/2433396.2433498","DOIUrl":"https://doi.org/10.1145/2433396.2433498","url":null,"abstract":"The tutorial will summarize the state-of-the art in the growing area of computational political science. Like many others, this research domain is being revolutionized by the availability of open, big data and the increasing reach and importance of social media. The surging interest on the part of the academic community is matched by intense efforts on the part of political campaigns to use online data in order to learn how to best disseminate information and reach the right potential donors or voters. In this context, a tutorial can summarize existing methods in a fascinating, high-interest area and allow participants with diverse backgrounds to get inspiration from the methods and problems studied. The tutorial will feature seminal research concerning (i) political polarization, (ii) election prediction and polling, and (iii) political campaigning and influence propagation. 
The goal is not only to familiarize attendees with ideas from related conferences such as WWW, ICWSM or CIKM, but also to present ideas and quantitative methods closer to political science such as Poole's and Rosenthal's NOMINATE score for a politician's political orientation.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125121647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Connecting comments and tags: improved modeling of social tagging systems
Dawei Yin, Shengbo Guo, Boris Chidlovskii, Brian D. Davison, C. Archambeau, Guillaume Bouchard
{"title":"Connecting comments and tags: improved modeling of social tagging systems","authors":"Dawei Yin, Shengbo Guo, Boris Chidlovskii, Brian D. Davison, C. Archambeau, Guillaume Bouchard","doi":"10.1145/2433396.2433466","DOIUrl":"https://doi.org/10.1145/2433396.2433466","url":null,"abstract":"Collaborative tagging systems are now deployed extensively to help users share and organize resources. Tag prediction and recommendation can simplify and streamline the user experience, and by modeling user preferences, predictive accuracy can be significantly improved. However, previous methods typically model user behavior based only on a log of prior tags, neglecting other behaviors and information in social tagging systems, e.g., commenting on items and connecting with other users. On the other hand, little is known about the connection and correlations among these behaviors and contexts in social tagging systems. In this paper, we investigate improved modeling for predictive social tagging systems. Our explanatory analyses demonstrate three significant challenges: coupled high order interaction, data sparsity and cold start on items. We tackle these problems by using a generalized latent factor model and fully Bayesian treatment. To evaluate performance, we test on two real-world data sets from Flickr and Bibsonomy. Our experiments on these data sets show that to achieve best predictive performance, it is necessary to employ a fully Bayesian treatment in modeling high order relations in social tagging system. 
Our methods noticeably outperform state-of-the-art approaches.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129442210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
Advanced graph mining for community evaluation in social networks and the web
C. Giatsidis, Fragkiskos D. Malliaros, M. Vazirgiannis
{"title":"Advanced graph mining for community evaluation in social networks and the web","authors":"C. Giatsidis, Fragkiskos D. Malliaros, M. Vazirgiannis","doi":"10.1145/2433396.2433495","DOIUrl":"https://doi.org/10.1145/2433396.2433495","url":null,"abstract":"Graphs constitute a dominant data structure and appear essentially in all forms of information. Examples are the Web graph, numerous social networks, protein interaction networks, terms dependency graphs and network topologies. The main features of these graphs are their huge volume and rate of change. Presumably, there is important hidden knowledge in the macroscopic topology and features of these graphs. A cornerstone issue here is the detection and evaluation of communities -- bearing multiple and diverse semantics. The tutorial reports the basic models of graph structures for undirected, directed and signed graphs and their properties. Next we offer a thorough review of fundamental methods for graph clustering and community detection, on both undirected and directed graphs. Then we survey community evaluation measures, including both the individual node based ones as well as those that take into account aggregate properties of communities. A special mention is made on approaches that capitalize on the concept of degeneracy (k-cores and extensions), as a novel means of community detection and evaluation. 
We justify the above foundational framework with applications on citation graphs, trust networks and protein graphs.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123635546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Workshop on semantic personalized information management (SPIM'13)
Till Plumbaum, E. W. D. Luca, Aldo Gangemi, M. Hausenblas
{"title":"Workshop on semantic personalized information management (SPIM'13)","authors":"Till Plumbaum, E. W. D. Luca, Aldo Gangemi, M. Hausenblas","doi":"10.1145/2433396.2433506","DOIUrl":"https://doi.org/10.1145/2433396.2433506","url":null,"abstract":"The SPIM workshop focuses especially on people that are working on the social or semantic Web, machine learning, user modeling, recommender systems, information retrieval, semantic interaction, or their combination. The goal is to bring together researchers and practitioners to initiating discussions on the different requirements and challenges coming with the social and semantic Web for personalized information retrieval systems. The workshop aims at improving the exchange of ideas between the different research communities and practitioners involved in the research on semantic personalized information management.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124511400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Retweet or not?: personalized tweet re-ranking
W. Feng, Jianyong Wang
{"title":"Retweet or not?: personalized tweet re-ranking","authors":"W. Feng, Jianyong Wang","doi":"10.1145/2433396.2433470","DOIUrl":"https://doi.org/10.1145/2433396.2433470","url":null,"abstract":"With Twitter being widely used around the world, users are facing enormous new tweets every day. Tweets are ranked in chronological order regardless of their potential interestedness. Users have to scan through pages of tweets to find useful information. Thus more personalized ranking scheme is needed to filter the overwhelmed information. Since retweet history reveals users' personal preference for tweets, we study how to learn a predictive model to rank the tweets according to their probability of being retweeted. In this way, users can find interesting tweets in a short time. To model the retweet behavior, we build a graph made up of three types of nodes: users, publishers and tweets. To incorporate all sources of information like users' profile, tweet quality, interaction history, etc, nodes and edges are represented by feature vectors. All these feature vectors are mapped to node weights and edge weights. Based on the graph, we propose a feature-aware factorization model to re-rank the tweets, which unifies the linear discriminative model and the low-rank factorization model seamlessly. Finally, we conducted extensive experiments on a real dataset crawled from Twitter. 
Experimental results show the effectiveness of our model.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"139 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127553072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 94
Exploiting social relations for sentiment analysis in microblogging
Xia Hu, Lei Tang, Jiliang Tang, Huan Liu
{"title":"Exploiting social relations for sentiment analysis in microblogging","authors":"Xia Hu, Lei Tang, Jiliang Tang, Huan Liu","doi":"10.1145/2433396.2433465","DOIUrl":"https://doi.org/10.1145/2433396.2433465","url":null,"abstract":"Microblogging, like Twitter and Sina Weibo, has become a popular platform of human expressions, through which users can easily produce content on breaking news, public events, or products. The massive amount of microblogging data is a useful and timely source that carries mass sentiment and opinions on various topics. Existing sentiment analysis approaches often assume that texts are independent and identically distributed (i.i.d.), usually focusing on building a sophisticated feature space to handle noisy and short texts, without taking advantage of the fact that the microblogs are networked data. Inspired by the social sciences findings that sentiment consistency and emotional contagion are observed in social networks, we investigate whether social relations can help sentiment analysis by proposing a Sociological Approach to handling Noisy and short Texts (SANT) for sentiment classification. In particular, we present a mathematical optimization formulation that incorporates the sentiment consistency and emotional contagion theories into the supervised learning process; and utilize sparse learning to tackle noisy texts in microblogging. 
An empirical study of two real-world Twitter datasets shows the superior performance of our framework in handling noisy and short tweets.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"2022 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133013541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 392
Robust query rewriting using anchor data
Nick Craswell, B. Billerbeck, Dennis Fetterly, Marc Najork
{"title":"Robust query rewriting using anchor data","authors":"Nick Craswell, B. Billerbeck, Dennis Fetterly, Marc Najork","doi":"10.1145/2433396.2433440","DOIUrl":"https://doi.org/10.1145/2433396.2433440","url":null,"abstract":"Query rewriting algorithms can be used as a form of query expansion, by combining the user's original query with automatically generated rewrites. Rewriting algorithms bring linguistic datasets to bear without the need for iterative relevance feedback, but most studies of rewriting have used proprietary datasets such as large-scale search logs. By contrast this paper uses readily available data, particularly ClueWeb09 link text with over 1.2 billion anchor phrases, to generate rewrites. To avoid overfitting, our initial analysis is performed using Million Query Track queries, leading us to identify three algorithms which perform well. We then test the algorithms on Web and newswire data. Results show good properties in terms of robustness and early precision.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130115064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Sharding social networks
Quang-huy Duong, Sharad Goel, J. Hofman, Sergei Vassilvitskii
{"title":"Sharding social networks","authors":"Quang-huy Duong, Sharad Goel, J. Hofman, Sergei Vassilvitskii","doi":"10.1145/2433396.2433424","DOIUrl":"https://doi.org/10.1145/2433396.2433424","url":null,"abstract":"Online social networking platforms regularly support hundreds of millions of users, who in aggregate generate substantially more data than can be stored on any single physical server. As such, user data are distributed, or sharded, across many machines. A key requirement in this setting is rapid retrieval not only of a given user's information, but also of all data associated with his or her social contacts, suggesting that one should consider the topology of the social network in selecting a sharding policy. In this paper we formalize the problem of efficiently sharding large social network databases, and evaluate several sharding strategies, both analytically and empirically. We find that random sharding---the de facto standard---results in provably poor performance even when frequently accessed nodes are replicated to many shards. By contrast, we demonstrate that one can substantially reduce querying costs by identifying and assigning tightly knit communities to shards. In particular, our theoretical analysis motivates a novel, scalable sharding algorithm that outperforms both random and location-based sharding schemes.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129282256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 29
Maguro, a system for indexing and searching over very large text collections
Knut Magne Risvik, Trishul M. Chilimbi, Henry Tan, Karthik Kalyanaraman, Chris Anderson
{"title":"Maguro, a system for indexing and searching over very large text collections","authors":"Knut Magne Risvik, Trishul M. Chilimbi, Henry Tan, Karthik Kalyanaraman, Chris Anderson","doi":"10.1145/2433396.2433486","DOIUrl":"https://doi.org/10.1145/2433396.2433486","url":null,"abstract":"Maguro is a system for efficiently searching very large collections of text content of up to 1 trillion documents at low cost. Search engines span across content that is very dynamic and highly augmented with metadata to the tail content of the web. A long tail distribution of content calls for different trade-offs in the design space for good efficiency across the entire index range. Maguro is designed for the long tail of content with less dynamics and less metadata, but very good cost efficiency. Maguro is part of the serving stack in Bing and allows us to scale the index significantly better.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"218 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132235175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26