Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining: Latest Papers, Page 10

Exploiting Human Mobility Patterns for Point-of-Interest Recommendation

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3170459

Zijun Yao

Abstract: Point-of-interest (POI) recommendation, which provides personalized recommendations of places to mobile users, is an important task in location-based social networks (LBSNs). Unlike traditional interest-oriented merchandise recommendation, POI recommendation is more complex because of timing effects: we need to examine whether a POI fits a user's availability. Prior studies that consider temporal effects by modeling check-in timestamps alone suffer from check-in data sparsity. In recent years, advances in positioning technology have accumulated a variety of urban data related to human mobility. There is potential to exploit human mobility patterns from heterogeneous information sources to improve POI recommendation. To this end, we propose a novel method that incorporates the degree of temporal matching between users and POIs into personalized POI recommendations. Specifically, we profile the temporal popularity of POIs, learn the latent regularity to characterize users, and conduct comprehensive experiments with real-world data. Evaluation results demonstrate the effectiveness of the proposed method.

Citations: 23
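
One way to picture the "degree of temporal matching" mentioned in the abstract above is to compare an hourly popularity profile of a POI with an hourly activity profile of a user. The Python below is only an illustration under that assumption: the 24-bin histograms, the cosine score, and all names and sample hours are hypothetical and not taken from the paper, and the sketch does not attempt the latent regularity learning the abstract describes.

```python
from math import sqrt
from typing import List, Sequence


def hourly_profile(checkin_hours: Sequence[int], bins: int = 24) -> List[float]:
    """Normalised histogram of check-in hours (0-23)."""
    counts = [0.0] * bins
    for hour in checkin_hours:
        counts[hour % bins] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical check-in hours (assumed data, for illustration only).
bar_profile = hourly_profile([20, 21, 22, 22, 23])   # evening venue
cafe_profile = hourly_profile([8, 8, 9, 10])         # morning venue
user_profile = hourly_profile([21, 22, 22])          # user active in the evening

bar_match = cosine(user_profile, bar_profile)
cafe_match = cosine(user_profile, cafe_profile)
print(f"bar: {bar_match:.3f}, cafe: {cafe_match:.3f}")
```

In this toy data the user's evening activity overlaps the bar's popular hours and not the cafe's, so the bar scores higher. A recommender could combine such a score with an interest-based score, though how the paper does so is not stated in the abstract.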

Offline A/B Testing for Recommender Systems

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-01-22 DOI: 10.1145/3159652.3159687

Alexandre Gilotte, Clément Calauzènes, Thomas Nedelec, A. Abraham, Simon Dollé

Abstract: Online A/B testing evaluates the impact of a new technology by running it in a real production environment and testing its performance on a subset of the users of the platform. It is a well-known practice to run a preliminary offline evaluation on historical data to iterate faster on new ideas, and to detect poor policies in order to avoid losing money or breaking the system. For such offline evaluations, we are interested in methods that can compute offline an estimate of the potential uplift in performance generated by a new technology. Offline performance can be measured using estimators known as counterfactual or off-policy estimators. Traditional counterfactual estimators, such as capped importance sampling or normalised importance sampling, exhibit unsatisfying bias-variance compromises when experimenting on personalized product recommendation systems. To overcome this issue, we model the bias incurred by these estimators rather than bound it in the worst case, which leads us to propose a new counterfactual estimator. We provide a benchmark of the different estimators showing their correlation with business metrics observed by running online A/B tests on a large-scale commercial recommender system.

Citations: 188
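
Capped and self-normalised importance sampling, which the abstract above names as traditional counterfactual estimators, can be sketched as follows. This is a minimal illustration of those two standard baselines, not the new estimator the paper proposes; the logged values, the cap, and the function names are hypothetical.

```python
from typing import List, Sequence


def importance_weights(target_probs: Sequence[float],
                       logging_probs: Sequence[float]) -> List[float]:
    """Ratio of target-policy to logging-policy probability per logged action."""
    return [t / l for t, l in zip(target_probs, logging_probs)]


def capped_is(rewards: Sequence[float], weights: Sequence[float],
              cap: float) -> float:
    """Capped importance sampling: clip each weight at `cap`, then average."""
    clipped = (min(w, cap) * r for w, r in zip(weights, rewards))
    return sum(clipped) / len(rewards)


def normalised_is(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Self-normalised importance sampling: divide by the sum of the weights."""
    weighted = sum(w * r for w, r in zip(weights, rewards))
    return weighted / sum(weights)


# Hypothetical log of four impressions; reward 1.0 stands for a click.
rewards = [1.0, 0.0, 1.0, 0.0]
logging_probs = [0.5, 0.5, 0.1, 0.25]   # action probability under production policy
target_probs = [0.25, 0.5, 0.5, 0.25]   # action probability under candidate policy

weights = importance_weights(target_probs, logging_probs)
cis = capped_is(rewards, weights, cap=2.0)
nis = normalised_is(rewards, weights)
print(f"capped IS: {cis:.3f}, normalised IS: {nis:.3f}")
```

The third impression has weight 5.0, which the cap reduces to 2.0. That clipping lowers variance at the cost of bias, which is the compromise the abstract refers to.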

Can you Trust the Trend?: Discovering Simpson's Paradoxes in Social Data

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-01-13 DOI: 10.1145/3159652.3159684

N. Alipourfard, Peter G. Fennell, Kristina Lerman

Citations: 26

Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-01-11 DOI: 10.1145/3159652.3176182

J. Pearl

Citations: 279

Neural Networks for Information Retrieval

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-01-07 DOI: 10.1145/3159652.3162009

Tom Kenter, Alexey Borisov, Christophe Van Gysel, Mostafa Dehghani, M. de Rijke, Bhaskar Mitra

Citations: 3

Ballpark Crowdsourcing: The Wisdom of Rough Group Comparisons

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2017-12-13 DOI: 10.1145/3159652.3159670

Tom Hope, Dafna Shahaf

Abstract: Crowdsourcing has become a popular method for collecting labeled training data. However, in many practical scenarios traditional labeling can be difficult for crowdworkers (for example, if the data is high-dimensional or unintuitive, or the labels are continuous). In this work, we develop a novel model for crowdsourcing that can complement standard practices by exploiting people's intuitions about groups and relations between them. We employ a recent machine learning setting, called Ballpark Learning, that can estimate individual labels given only a coarse, aggregated signal over groups of data points. To address the important case of continuous labels, we extend the Ballpark setting (which focused on classification) to regression problems. We formulate the problem as a convex optimization problem and propose fast, simple methods with an innate robustness to outliers. We evaluate our methods on real-world datasets, demonstrating how useful constraints about groups can be harnessed from a crowd of non-experts. Our methods can rival supervised models trained on many true labels, and can obtain considerably better results from the crowd than a standard label-collection process (for a lower price). By collecting rough guesses on groups of instances and using machine learning to infer the individual labels, our lightweight framework is able to address core crowdsourcing challenges and train machine learning models in a cost-effective way.

Citations: 4

Listening to Chaotic Whispers: A Deep Learning Framework for News-oriented Stock Trend Prediction

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2017-12-06 DOI: 10.1145/3159652.3159690

Ziniu Hu, Weiqing Liu, Jiang Bian, Xuanzhe Liu, Tie-Yan Liu

Abstract: Stock trend prediction plays a critical role in seeking maximized profit from stock investment. However, precise trend prediction is very difficult because of the highly volatile and non-stationary nature of the stock market. The explosion of information on the Internet, together with advances in natural language processing and text mining techniques, has enabled investors to unveil market trends and volatility from online content. Unfortunately, the quality, trustworthiness, and comprehensiveness of online content related to the stock market vary drastically, and a large portion consists of low-quality news, comments, or even rumors. To address this challenge, we imitate the learning process of human beings facing such chaotic online news, driven by three principles: sequential content dependency, diverse influence, and effective and efficient learning. In this paper, to capture the first two principles, we design a Hybrid Attention Network (HAN) to predict the stock trend based on the sequence of recent related news. Moreover, we apply the self-paced learning mechanism to imitate the third principle. Extensive experiments on real-world stock market data demonstrate the effectiveness of our framework. A further simulation illustrates that a straightforward trading strategy based on our proposed framework can significantly increase the annualized return.

Citations: 261

SHINE: Signed Heterogeneous Information Network Embedding for Sentiment Link Prediction

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2017-12-03 DOI: 10.1145/3159652.3159666

Hongwei Wang, Fuzheng Zhang, Min Hou, Xing Xie, M. Guo, Qi Liu

Abstract: In online social networks people often express attitudes towards others, which forms massive sentiment links among users. Predicting the sign of sentiment links is a fundamental task in many areas such as personal advertising and public opinion analysis. Previous works mainly focus on textual sentiment classification; however, text information can only disclose the "tip of the iceberg" about users' true opinions, most of which are unobserved but implied by other sources of information such as social relations and users' profiles. To address this problem, in this paper we investigate how to predict possibly existing sentiment links in the presence of heterogeneous information. First, due to the lack of explicit sentiment links in mainstream social networks, we establish a labeled heterogeneous sentiment dataset which consists of users' sentiment relations, social relations and profile knowledge, using an entity-level sentiment extraction method. Then we propose a novel and flexible end-to-end Signed Heterogeneous Information Network Embedding (SHINE) framework to extract users' latent representations from heterogeneous networks and predict the sign of unobserved sentiment links. SHINE utilizes multiple deep autoencoders to map each user into a low-dimensional feature space while preserving the network structure. We demonstrate the superiority of SHINE over state-of-the-art baselines on link prediction and node recommendation in two real-world datasets. The experimental results also demonstrate the efficacy of SHINE in cold-start scenarios.

Citations: 279

Joint Non-negative Matrix Factorization for Learning Ideological Leaning on Twitter

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2017-11-28 DOI: 10.1145/3159652.3159669

Preethi Lahoti, Venkata Rama Kiran Garimella, A. Gionis

Abstract: People are shifting from traditional news sources to online news at an incredibly fast rate. However, the technology behind online news consumption promotes content that confirms the users' existing point of view. This phenomenon has led to polarization of opinions and intolerance towards opposing views. Thus, a key problem is to model information filter bubbles on social media and design methods to eliminate them. In this paper, we use a machine-learning approach to learn a liberal-conservative ideology space on Twitter, and show how we can use the learned latent space to tackle the filter bubble problem. We model the problem of learning the liberal-conservative ideology space of social media users and media sources as a constrained non-negative matrix-factorization problem. Our model incorporates the social-network structure and content-consumption information in a joint factorization problem with shared latent factors. We validate our model and solution on a real-world Twitter dataset consisting of controversial topics, and show that we are able to separate users by ideology with over 90% purity. When applied to media sources, our approach estimates ideology scores that are highly correlated (Pearson correlation 0.9) with ground-truth ideology scores. Finally, we demonstrate the utility of our model in real-world scenarios, by illustrating how the learned ideology latent space can be used to develop exploratory and interactive interfaces that can help users in diffusing their information filter bubble.

Citations: 43
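
The abstract above reports two evaluation figures, cluster purity and Pearson correlation. As a rough guide to what those numbers measure, here is a small sketch using the standard definitions of both metrics; the cluster assignments, labels, and scores are made-up sample values, not data from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt
from typing import Hashable, Sequence


def purity(cluster_ids: Sequence[Hashable], true_labels: Sequence[Hashable]) -> float:
    """Fraction of items that carry the majority label of their cluster."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster][label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical users: assigned cluster vs. known leaning.
clusters = [0, 0, 0, 1, 1, 1]
leanings = ["liberal", "liberal", "conservative",
            "conservative", "conservative", "conservative"]
user_purity = purity(clusters, leanings)   # 5 of 6 users match their cluster majority

# Hypothetical media sources: estimated vs. ground-truth ideology score.
estimated = [0.1, 0.4, 0.6, 0.9]
ground_truth = [0.0, 0.5, 0.5, 1.0]
source_corr = pearson(estimated, ground_truth)

print(f"purity: {user_purity:.3f}, Pearson r: {source_corr:.3f}")
```

Purity rewards clusters that are dominated by one ground-truth label, while Pearson correlation checks whether estimated scores rise and fall linearly with the reference scores.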

Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2017-11-27 DOI: 10.1145/3159652.3159734

Jooyeon Kim, Behzad Tabibian, Alice H. Oh, B. Scholkopf, M. Gomez-Rodriguez

Abstract: Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking. If this party identifies the story as misinformation, it is marked as disputed. However, given the uncertain number of exposures, the high cost of fact checking, and the trade-off between flags and exposures, the above-mentioned procedure requires careful reasoning and smart algorithms which, to the best of our knowledge, do not exist to date. In this paper, we first introduce a flexible representation of the above procedure using the framework of marked temporal point processes. Then, we develop a scalable online algorithm, CURB, to select which stories to send for fact checking and when to do so to efficiently reduce the spread of misinformation with provable guarantees. In doing so, we need to solve a novel stochastic optimal control problem for stochastic differential equations with jumps, which is of independent interest. Experiments on two real-world datasets gathered from Twitter and Weibo show that our algorithm may be able to effectively reduce the spread of fake news and misinformation.

Citations: 187
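
The crowd-powered procedure described at the start of the abstract above (flag a story, and forward it for fact checking once it has enough flags) can be sketched as a simple counter. This is only the fixed-threshold baseline, not the CURB algorithm, which per the abstract decides which stories to send and when; the class name, threshold value, and story identifiers are hypothetical.

```python
from collections import Counter
from typing import List


class FlagThresholdQueue:
    """Forward a story to fact checking once it has collected enough flags."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.flags: Counter = Counter()
        self.sent: List[str] = []   # stories forwarded to the fact checker, in order

    def flag(self, story_id: str) -> bool:
        """Record one user flag; return True if this flag triggers a check."""
        self.flags[story_id] += 1
        # Equality (not >=) ensures each story is forwarded only once.
        if self.flags[story_id] == self.threshold:
            self.sent.append(story_id)
            return True
        return False


# Hypothetical stream of flags on two stories, with an assumed threshold of 3.
queue = FlagThresholdQueue(threshold=3)
for story in ["story-a", "story-b", "story-a", "story-a", "story-b", "story-a"]:
    queue.flag(story)

print(queue.sent)
```

A fixed threshold ignores how many users were exposed to each story, which is one of the trade-offs the abstract says the simple procedure handles poorly.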