Proceedings of the 25th International Conference on World Wide Web最新文献_第9页

A Study of Retrieval Models for Long Documents and Queries in Information Retrieval 信息检索中长文档和查询的检索模型研究

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883009

Ronan Cummins

引用次数: 10

Economic Recommendation with Surplus Maximization 盈余最大化的经济建议

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2882973

Yongfeng Zhang, Qi Zhao, Yi Zhang, D. Friedman, Min Zhang, Yiqun Liu, Shaoping Ma

{"title":"Economic Recommendation with Surplus Maximization","authors":"Yongfeng Zhang, Qi Zhao, Yi Zhang, D. Friedman, Min Zhang, Yiqun Liu, Shaoping Ma","doi":"10.1145/2872427.2882973","DOIUrl":"https://doi.org/10.1145/2872427.2882973","url":null,"abstract":"A prime function of many major World Wide Web applications is Online Service Allocation (OSA), the function of matching individual consumers with particular services/goods (which may include loans or jobs as well as products) each with its own producer. In the applications of interest, consumers are free to choose, so OSA usually takes the form of personalized recommendation or search in practice. The performance metrics of recommender and search systems currently tend to focus on just one side of the match, in some cases the consumers (e.g. satisfaction) and in other cases the producers (e.g., profit). However, a sustainable OSA platform needs benefit both consumers and producers; otherwise the neglected party eventually may stop using it. In this paper, we show how to adapt economists' traditional idea of maximizing total surplus (the sum of consumer net benefit and producer profit) to the heterogeneous world of online service allocation, in an effort to promote the web intelligence for social good in online eco-systems. Modifications of traditional personalized recommendation algorithms enable us to apply Total Surplus Maximization (TSM) to three very different types of real-world tasks -- e-commerce, P2P lending and freelancing. The results for all three tasks suggest that TSM compares very favorably to currently popular approaches, to the benefit of both producers and consumers.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"215 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74494527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 36

Mining User Intentions from Medical Queries: A Neural Network Based Heterogeneous Jointly Modeling Approach 从医疗查询中挖掘用户意图:一种基于神经网络的异构联合建模方法

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2874810

Chenwei Zhang, Wei Fan, Nan Du, Philip S. Yu

{"title":"Mining User Intentions from Medical Queries: A Neural Network Based Heterogeneous Jointly Modeling Approach","authors":"Chenwei Zhang, Wei Fan, Nan Du, Philip S. Yu","doi":"10.1145/2872427.2874810","DOIUrl":"https://doi.org/10.1145/2872427.2874810","url":null,"abstract":"Text queries are naturally encoded with user intentions. An intention detection task tries to model and discover intentions that user encoded in text queries. Unlike conventional text classification tasks where the label of text is highly correlated with some topic-specific words, words from different topic categories tend to co-occur in medical related queries. Besides the existence of topic-specific words and word order, word correlations and the way words organized into sentence are crucial to intention detection tasks. In this paper, we present a neural network based jointly modeling approach to model and capture user intentions in medical related text queries. Regardless of the exact words in text queries, the proposed method incorporates two types of heterogeneous information: 1) pairwise word feature correlations and 2) part-of-speech tags of a sentence to jointly model user intentions. Variable-length text queries are first inherently taken care of by a fixed-size pairwise feature correlation matrix. Moreover, convolution and pooling operations are applied on feature correlations to fully exploit latent semantic structure within the query. Sentence rephrasing is finally introduced as a data augmentation technique to improve model generalization ability during model training. Experiment results on real world medical queries have shown that the proposed method is able to extract complete and precise user intentions from text queries.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"396 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76436484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 47

A Piggyback System for Joint Entity Mention Detection and Linking in Web Queries Web查询中联合实体提及检测与链接的背载系统

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883061

M. Cornolti, P. Ferragina, Massimiliano Ciaramita, Stefan Rüd, Hinrich Schütze

{"title":"A Piggyback System for Joint Entity Mention Detection and Linking in Web Queries","authors":"M. Cornolti, P. Ferragina, Massimiliano Ciaramita, Stefan Rüd, Hinrich Schütze","doi":"10.1145/2872427.2883061","DOIUrl":"https://doi.org/10.1145/2872427.2883061","url":null,"abstract":"In this paper we study the problem of linking open-domain web-search queries towards entities drawn from the full entity inventory of Wikipedia articles. We introduce SMAPH-2, a second-order approach that, by piggybacking on a web search engine, alleviates the noise and irregularities that characterize the language of queries and puts queries in a larger context in which it is easier to make sense of them. The key algorithmic idea underlying SMAPH-2 is to first discover a candidate set of entities and then link-back those entities to their mentions occurring in the input query. This allows us to confine the possible concepts pertinent to the query to only the ones really mentioned in it. The link-back is implemented via a collective disambiguation step based upon a supervised ranking model that makes one joint prediction for the annotation of the complete query optimizing directly the F1 measure. We evaluate both known features, such as word embeddings and semantic relatedness among entities, and several novel features such as an approximate distance between mentions and entities (which can handle spelling errors). We demonstrate that SMAPH-2 achieves state-of-the-art performance on the ERD@SIGIR2014 benchmark. We also publish GERDAQ (General Entity Recognition, Disambiguation and Annotation in Queries), a novel, public dataset built specifically for web-query entity linking via a crowdsourcing effort. SMAPH-2 outperforms the benchmarks by comparable margins also on GERDAQ.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83917914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 33

An Empirical Analysis of Algorithmic Pricing on Amazon Marketplace 亚马逊市场算法定价的实证分析

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883089

Le Chen, A. Mislove, Christo Wilson

{"title":"An Empirical Analysis of Algorithmic Pricing on Amazon Marketplace","authors":"Le Chen, A. Mislove, Christo Wilson","doi":"10.1145/2872427.2883089","DOIUrl":"https://doi.org/10.1145/2872427.2883089","url":null,"abstract":"The rise of e-commerce has unlocked practical applications for algorithmic pricing (also called dynamic pricing algorithms), where sellers set prices using computer algorithms. Travel websites and large, well known e-retailers have already adopted algorithmic pricing strategies, but the tools and techniques are now available to small-scale sellers as well. While algorithmic pricing can make merchants more competitive, it also creates new challenges. Examples have emerged of cases where competing pieces of algorithmic pricing software interacted in unexpected ways and produced unpredictable prices, as well as cases where algorithms were intentionally designed to implement price fixing. Unfortunately, the public currently lack comprehensive knowledge about the prevalence and behavior of algorithmic pricing algorithms in-the-wild. In this study, we develop a methodology for detecting algorithmic pricing, and use it empirically to analyze their prevalence and behavior on Amazon Marketplace. We gather four months of data covering all merchants selling any of 1,641 best-seller products. Using this dataset, we are able to uncover the algorithmic pricing strategies adopted by over 500 sellers. We explore the characteristics of these sellers and characterize the impact of these strategies on the dynamics of the marketplace.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81711552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 171

Beyond the Baseline: Establishing the Value in Mobile Phone Based Poverty Estimates 超越基线:建立基于手机的贫困估算的价值

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883076

Chris Smith-Clarke, L. Capra

{"title":"Beyond the Baseline: Establishing the Value in Mobile Phone Based Poverty Estimates","authors":"Chris Smith-Clarke, L. Capra","doi":"10.1145/2872427.2883076","DOIUrl":"https://doi.org/10.1145/2872427.2883076","url":null,"abstract":"Within the remit of `Data for Development' there have been a number of promising recent works that investigate the use of mobile phone Call Detail Records (CDRs) to estimate the spatial distribution of poverty or socio-economic status. The methods being developed have the potential to offer immense value to organisations and agencies who currently struggle to identify the poorest parts of a country, due to the lack of reliable and up to date survey data in certain parts of the world. However, the results of this research have thus far only been presented in isolation rather than in comparison to any alternative approach or benchmark. Consequently, the true practical value of these methods remains unknown. Here, we seek to allay this shortcoming, by proposing two baseline poverty estimators grounded on concrete usage scenarios: one that exploits correlation with population density only, to be used when no poverty data exists at all; and one that also exploits spatial autocorrelation, to be used when poverty data has been collected for a few regions within a country. We then compare the predictive performance of these baseline models with models that also include features derived from CDRs, so to establish their real added value. We present extensive analysis of the performance of all these models on data acquired for two developing countries -- Senegal and Ivory Coast. Our results reveal that CDR-based models do provide more accurate estimates in most cases; however, the improvement is modest and more significant when estimating (extreme) poverty intensity rates rather than mean wealth.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90582530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 14

Exploiting Dining Preference for Restaurant Recommendation 利用用餐偏好进行餐厅推荐

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2882995

Fuzheng Zhang, Nicholas Jing Yuan, Kai Zheng, Defu Lian, Xing Xie, Y. Rui

{"title":"Exploiting Dining Preference for Restaurant Recommendation","authors":"Fuzheng Zhang, Nicholas Jing Yuan, Kai Zheng, Defu Lian, Xing Xie, Y. Rui","doi":"10.1145/2872427.2882995","DOIUrl":"https://doi.org/10.1145/2872427.2882995","url":null,"abstract":"The wide adoption of location-based services provide the potential to understand people's mobility pattern at an unprecedented level, which can also enable food-service industry to accurately predict consumers' dining behavior. In this paper, based on users' dining implicit feedbacks (restaurant visit via check-ins), explicit feedbacks (restaurant reviews) as well as some meta data (e.g., location, user demographics, restaurant attributes), we aim at recommending each user a list of restaurants for his next dining. Implicit and Explicit feedbacks of dining behavior exhibit different characteristics of user preference. Therefore, in our work, user's dining preference mainly contains two parts: implicit preference coming from check-in data (implicit feedbacks) and explicit preference coming from rating and review data (explicit feedbacks). For implicit preference, we first apply a probabilistic tensor factorization model (PTF) to capture preference in a latent subspace. Then, in order to incorporate contextual signals from meta data, we extend PTF by proposing an Implicit Preference Model (IPM), which can simultaneously capture users'/restaurants'/time' preference in the collaborative filtering and dining preference in a specific context (e.g., spatial distance preference, environmental preference). For explicit preference, we propose Explicit Preference Model (EPM) by combining matrix factorization with topic modeling to discover the user preference embedded both in rating score and text content. Finally, we design a unified model termed as Collective Implicit Explicit Preference Model (CIEPM) to combine implicit and explicit preference together for restaurant recommendation. To evaluate the performance of our system, we conduct extensive experiments with large-scale datasets covering hundreds of thousands of users and restaurants. The results reveal that our system is effective for restaurant recommendation.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87797722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 53

On the Relevance of Irrelevant Alternatives 论不相关选择的相关性

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883025

Austin R. Benson, Ravi Kumar, A. Tomkins

{"title":"On the Relevance of Irrelevant Alternatives","authors":"Austin R. Benson, Ravi Kumar, A. Tomkins","doi":"10.1145/2872427.2883025","DOIUrl":"https://doi.org/10.1145/2872427.2883025","url":null,"abstract":"Multinomial logistic regression is a powerful tool to model choice from a finite set of alternatives, but it comes with an underlying model assumption called the independence of irrelevant alternatives, stating that any item added to the set of choices will decrease all other items' likelihood by an equal fraction. We perform statistical tests of this assumption across a variety of datasets and give results showing how often it is violated. When this axiom is violated, choice theorists will often invoke a richer model known as nested logistic regression, in which information about competition among items is encoded in a tree structure known as a nest. However, to our knowledge there are no known algorithms to induce the correct nest structure. We present the first such algorithm, which runs in quadratic time under an oracle model, and we pair it with a matching lower bound. We then perform experiments on synthetic and real datasets to validate the algorithm, and show that nested logit over learned nests outperforms traditional multinomial regression. Finally, in addition to automatically learning nests, we show how nests may be constructed by hand to test hypotheses about the data, and evaluated by their explanatory power.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82243638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 55

Unsupervised, Efficient and Semantic Expertise Retrieval 无监督、高效的语义专业知识检索

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2882974

Christophe Van Gysel, M. de Rijke, M. Worring

引用次数: 67

Just in Time: Controlling Temporal Performance in Crowdsourcing Competitions 及时:控制众包竞争中的时间表现

Proceedings of the 25th International Conference on World Wide Web Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883075

Markus Rokicki, Sergej Zerr, Stefan Siersdorfer

{"title":"Just in Time: Controlling Temporal Performance in Crowdsourcing Competitions","authors":"Markus Rokicki, Sergej Zerr, Stefan Siersdorfer","doi":"10.1145/2872427.2883075","DOIUrl":"https://doi.org/10.1145/2872427.2883075","url":null,"abstract":"Many modern data analytics applications in areas such as crisis management, stock trading, and healthcare, rely on components capable of nearly real-time processing of streaming data produced at varying rates. In addition to automatic processing methods, many tasks involved in those applications require further human assessment and analysis. However, current crowdsourcing platforms and systems do not support stream processing with variable loads. In this paper, we investigate how incentive mechanisms in competition based crowdsourcing can be employed in such scenarios. More specifically, we explore techniques for stimulating workers to dynamically adapt to both anticipated and sudden changes in data volume and processing demand, and we analyze effects such as data processing throughput, peak-to-average ratios, and saturation effects. To this end, we study a wide range of incentive schemes and utility functions inspired by real world applications. Our large-scale experimental evaluation with more than 900 participants and more than 6200 hours of work spent by crowd workers demonstrates that our competition based mechanisms are capable of adjusting the throughput of online workers and lead to substantial on-demand performance boosts.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79697598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8