Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining: Latest Papers

Visualizing Graph Differences from Social Media Streams
Minjeong Shin, Dongwoo Kim, Jae Hee Lee, Umanga Bista, Lexing Xie
Abstract: We propose KGdiff, a new interactive visualization tool for social media content focusing on entities and relationships. The core component is a layout algorithm that highlights the differences between two graphs. We apply this algorithm to knowledge graphs consisting of named entities and their relations extracted from text streams over different time periods. The visualization system provides additional information such as the volume and frequency ranking of entities and allows users to select which parts of the graph to visualize interactively. On Twitter and news article collections, KGdiff allows users to compare different data subsets. Results of such comparisons often reveal topical or geographical changes in a discussion. More broadly, graph differences are useful for a wide range of relational data comparison tasks, such as comparing social interaction graphs, identifying changes in user behavior, or discovering differences in graphs from distinct sources, geography, or political stance.
DOI: https://doi.org/10.1145/3289600.3290616
Published: 2019-01-30
Citations: 0
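The abstract above centres on highlighting the differences between two graphs. As a rough illustration only (the KGdiff layout algorithm itself is not described on this page), the sketch below partitions the edges of two small entity graphs into those unique to each graph and those they share. The function name `diff_edges` and the edge-list input format are assumptions made for this example.

```python
def diff_edges(graph_a, graph_b):
    """Partition undirected edges into only-in-A, only-in-B, and shared sets.

    Each graph is given as an iterable of (entity, entity) pairs.
    This is a hypothetical helper, not part of KGdiff.
    """
    # frozenset makes ("A", "B") and ("B", "A") compare as the same edge.
    edges_a = {frozenset(edge) for edge in graph_a}
    edges_b = {frozenset(edge) for edge in graph_b}
    return {
        "only_a": edges_a - edges_b,
        "only_b": edges_b - edges_a,
        "shared": edges_a & edges_b,
    }
```

A visualization could then colour the three edge sets differently, for example one colour per time period and a neutral colour for shared edges.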
MSA: Jointly Detecting Drug Name and Adverse Drug Reaction Mentioning Tweets with Multi-Head Self-Attention
Chuhan Wu, Fangzhao Wu, Zhigang Yuan, Junxin Liu, Yongfeng Huang, Xing Xie
Abstract: Twitter is a popular social media platform for information sharing and dissemination. Many Twitter users post tweets to share their experiences about drugs and adverse drug reactions. Automatic detection of tweets mentioning drug names and adverse drug reactions at a large scale has important applications such as pharmacovigilance. However, detecting tweets that mention drug names and adverse drug reactions is very challenging, because tweets are usually very noisy and informal, and these mentions contain many misspellings and user-created abbreviations. In addition, these mentions are usually context dependent. In this paper, we propose a neural approach with a hierarchical tweet representation and a multi-head self-attention mechanism to jointly detect tweets mentioning drug names and adverse drug reactions. To alleviate the influence of misspellings and user-created abbreviations in tweets, we propose to use a hierarchical tweet representation model that first learns word representations from characters and then learns tweet representations from words. In addition, we propose to use a multi-head self-attention mechanism to capture the interactions between words and fully model the contexts of tweets. We also use an additive attention mechanism to select the informative words and learn more informative tweet representations. Experimental results validate the effectiveness of our approach.
DOI: https://doi.org/10.1145/3289600.3290980
Published: 2019-01-30
Citations: 8
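The abstract above names multi-head self-attention as the mechanism for capturing interactions between words. The sketch below shows the general idea in plain Python and is not the authors' model: it omits the learned query, key, and value projections and simply splits each word vector into equal slices, one per head (a simplifying assumption). All function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors."""
    dim = len(vectors[0])
    output = []
    for query in vectors:
        # Similarity of this word to every word in the tweet.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Each output vector is a weighted average of all word vectors.
        output.append([sum(w * value[j] for w, value in zip(weights, vectors))
                       for j in range(dim)])
    return output

def multi_head_self_attention(word_vectors, num_heads):
    """Attend separately over num_heads slices of each vector, then concatenate."""
    dim = len(word_vectors[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        chunk = [vec[h * head_dim:(h + 1) * head_dim] for vec in word_vectors]
        heads.append(self_attention(chunk))
    # Concatenate the per-head outputs back into one vector per word.
    return [sum((head[i] for head in heads), []) for i in range(len(word_vectors))]
```

The output has one vector per input word with the same dimensionality, so a later layer (such as the additive attention mentioned in the abstract) can pool it into a single tweet representation.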
A Sequential Test for Selecting the Better Variant: Online A/B testing, Adaptive Allocation, and Continuous Monitoring
Nianqiao Ju, D. Hu, Adam Henderson, Liangjie Hong
Abstract: Online A/B tests play an instrumental role for Internet companies to improve products and technologies in a data-driven manner. An online A/B test, in its most straightforward form, can be treated as a static hypothesis test where traditional statistical tools such as p-values and power analysis might be applied to help decision makers determine which variant performs better. However, a static A/B test incurs both a time cost and an opportunity cost for rapid product iterations. For time cost, a fast-paced product evolution pushes its shareholders to consistently monitor results from online A/B experiments, which usually invites peeking and altering experimental designs as data are collected. It is recognized that this flexibility might harm statistical guarantees if not introduced in the right way, especially when online tests are considered as static hypothesis tests. For opportunity cost, a static test usually entails a static allocation of users into different variants, which prevents an immediate roll-out of the better version to a larger audience or risks alienating users who may suffer from a bad experience. While some works try to tackle these challenges, no prior method focuses on a holistic solution to both issues. In this paper, we propose a unified framework utilizing sequential analysis and multi-armed bandits to address the time cost and the opportunity cost of static online tests simultaneously. In particular, we present an imputed sequential Girshick test that accommodates online data and dynamic allocation of data. The unobserved potential outcomes are treated as missing data and are imputed using empirical averages. Focusing on the binomial model, we demonstrate through extensive experiments that the proposed imputed Girshick test achieves Type-I error and power control with both a fixed allocation ratio and an adaptive allocation such as Thompson Sampling. In addition, we also run experiments on historical Etsy.com A/B tests to show the reduction in opportunity cost when using the proposed method.
DOI: https://doi.org/10.1145/3289600.3291025
Published: 2019-01-30
Citations: 20
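The abstract above mentions Thompson Sampling as one adaptive allocation scheme under a binomial model. The sketch below simulates only that allocation step for Bernoulli outcomes with Beta(1, 1) priors; it does not implement the imputed Girshick test. The function name `thompson_allocate`, the prior, and the simulated conversion rates are assumptions made for illustration.

```python
import random

def thompson_allocate(true_rates, rounds, seed=0):
    """Simulate Thompson Sampling over Bernoulli variants with Beta(1, 1) priors.

    Returns the number of users allocated to each variant.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw one plausible conversion rate per variant from its posterior.
        draws = [rng.betavariate(s + 1, f + 1)
                 for s, f in zip(successes, failures)]
        arm = draws.index(max(draws))
        # Observe a simulated binary outcome for the chosen variant.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]
```

Because the better variant is drawn more often as evidence accumulates, fewer users are exposed to the worse variant, which is the opportunity-cost argument the abstract makes against static allocation.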
Slice: Scalable Linear Extreme Classifiers Trained on 100 Million Labels for Related Searches
Himanshu Jain, V. Balasubramanian, Bhanu Chunduri, M. Varma
Abstract: This paper reformulates the problem of recommending related queries on a search engine as an extreme multi-label learning task. Extreme multi-label learning aims to annotate each data point with the most relevant subset of labels from an extremely large label set. Each of the top 100 million queries on Bing was treated as a separate label in the proposed reformulation and an extreme classifier was learnt which took the user's query as input and predicted the relevant subset of 100 million queries as output. Unfortunately, state-of-the-art extreme classifiers have not been shown to scale beyond 10 million labels and have poor prediction accuracies for queries. This paper therefore develops the Slice algorithm which can be accurately trained on low-dimensional, dense deep learning features popularly used to represent queries and which efficiently scales to 100 million labels and 240 million training points. Slice achieves this by reducing the training and prediction times from linear to logarithmic in the number of labels based on a novel negative sampling technique. This allows the proposed reformulation to address some of the limitations of traditional related searches approaches in terms of coverage, density and quality. Experiments on publicly available extreme classification datasets with low-dimensional dense features as well as related searches datasets mined from the Bing logs revealed that Slice could be more accurate than leading extreme classifiers while also scaling to 100 million labels. Furthermore, Slice was found to improve the accuracy of recommendations by 10% as compared to state-of-the-art related searches techniques. Finally, when added to the ensemble in production in Bing, Slice was found to increase the trigger coverage by 52%, the suggestion density by 33%, the overall success rate by 2.6% and the success rate for tail queries by 12.6%. Slice's source code can be downloaded from [21].
DOI: https://doi.org/10.1145/3289600.3290979
Published: 2019-01-30
Citations: 118
Lightweight Lexical and Semantic Evidence for Detecting Classes Among Wikipedia Articles
Marius Pasca, Travis Wolfe
Abstract: A supervised method relies on simple, lightweight features in order to distinguish Wikipedia articles that are classes (Shield volcano) from other articles (Kilauea). The features are lexical or semantic in nature. Experimental results in multiple languages over multiple evaluation sets demonstrate the superiority of the proposed method over previous work.
DOI: https://doi.org/10.1145/3289600.3291020
Published: 2019-01-30
Citations: 0
CORALS: Who Are My Potential New Customers? Tapping into the Wisdom of Customers' Decisions
Ruirui Li, Jyun-Yu Jiang, C. Ju, Wei Wang
Abstract: Identifying and recommending potential new customers is crucial to the survival and success of local businesses. A key component of identifying the right customers is understanding the decision-making process of choosing one business over the others. However, modeling this process is an extremely challenging task as a decision is influenced by multiple factors. These factors include but are not limited to an individual's taste or preference, the location accessibility of a business, and the reputation of a business from social media. Most recommender systems lack the power to integrate multiple factors together and are hardly extensible to accommodate new incoming factors. In this paper, we introduce a unified framework, CORALS, which considers the personal preferences of different customers, the geographical influence, and the reputation of local businesses in the customer recommendation task. To evaluate the proposed model, we conduct a series of experiments to extensively compare with 12 state-of-the-art methods using two real-world datasets. The results demonstrate that CORALS outperforms all these baselines by a significant margin in most scenarios. In addition to identifying potential new customers, we also break down the analysis for different types of businesses to evaluate the impact of various factors that may affect customers' decisions. This information, in turn, provides a great resource for local businesses to adjust their advertising strategies and business services to attract more prospective customers.
DOI: https://doi.org/10.1145/3289600.3290995
Published: 2019-01-30
Citations: 10
Investment Recommendation System for Low-Liquidity Online Peer to Peer Lending (P2PL) Marketplaces
K. Ren, Avinash Malik
Abstract: Online P2PL systems allow lending and borrowing between peers without the need for intermediaries such as banks. Convenience and high rates of return have made P2PL systems very popular. Recommendation systems have been developed to help lenders make wise investment decisions, lowering the chances of overall default. However, P2PL marketplaces suffer from low financial liquidity, i.e., loans of different grades are not always available for investment. Moreover, P2PL investments are long term (usually a few years); hence, incorrect investments cannot be liquidated easily. Overall, the state-of-the-art recommendation systems do not account for the low market liquidity and hence can lead to unwise investment decisions. In this paper we remedy this shortcoming by building a recommendation framework that builds an investment portfolio, which results in the highest return and the lowest risk along with a statistical measure of the number of days required for the amount to be completely funded. Our recommendation system predicts the grade and number of loans that will appear in the future when constructing the investment portfolio. Experimental results show that our recommendation engine outperforms the current state-of-the-art techniques. Our recommendation system can increase the probability of achieving the highest return with the lowest risk by ~69%.
DOI: https://doi.org/10.1145/3289600.3290959
Published: 2019-01-30
Citations: 10
Bridging Models for Popularity Prediction on Social Media
Swapnil Mishra
Abstract: Understanding and predicting the popularity of online items is an important open problem in social media analysis. Most of the recent work on popularity prediction is either based on learning a variety of features from full network data or using generative processes to model the event time data. We identify two gaps in the current state-of-the-art prediction models. The first is the unexplored connection and comparison between the two aforementioned approaches. In our work, we bridge the gap between feature-driven and generative models by modelling social cascades with a marked Hawkes self-exciting point process. We then learn a predictive layer on top for popularity prediction using a collection of cascade histories. Secondly, the existing methods typically focus on a single source of external influence, whereas for many types of online content such as YouTube videos or news articles, attention is driven by multiple heterogeneous sources simultaneously, e.g. microblogs or traditional media coverage. We propose a recurrent neural network based model for asynchronous streams that connects multiple streams of different granularity via joint inference. We further design two new measures, one to explain the viral potential of videos, the other to uncover latent influences including seasonal trends. This work provides accurate and explainable popularity predictions, as well as computational tools for content producers and marketers to allocate resources for promotion campaigns.
DOI: https://doi.org/10.1145/3289600.3291598
Published: 2019-01-30
Citations: 6
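The abstract above models social cascades with a marked Hawkes self-exciting point process, in which each past event temporarily raises the rate of future events. The sketch below computes such an intensity under assumptions of our own: an exponential decay kernel and excitation that scales linearly with the event mark (for example, the poster's follower count). The abstract does not specify the kernel or mark function, so the parameter names `mu`, `alpha`, and `beta` and the function `hawkes_intensity` are illustrative only.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Event rate at time t for a marked Hawkes process.

    events: list of (time, mark) pairs for past events.
    mu: background rate; alpha: excitation scale; beta: decay rate.
    Assumes an exponential kernel, which is an illustrative choice.
    """
    # Only events strictly before t contribute, each decaying over time.
    excitation = sum(mark * alpha * math.exp(-beta * (t - t_i))
                     for t_i, mark in events if t_i < t)
    return mu + excitation
```

With no prior events the intensity equals the background rate `mu`; immediately after an event with a large mark it jumps and then decays back toward `mu`.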
The Influence of Image Search Intents on User Behavior and Satisfaction
Zhijing Wu, Yiqun Liu, Qianfan Zhang, Kailu Wu, Min Zhang, Shaoping Ma
Abstract: Understanding search intents behind queries is of vital importance for improving search performance or designing better evaluation metrics. Although many efforts have gone into Web search user intent taxonomies and into investigating how users' interaction behaviors vary with the intent types, only a few of them have been made specifically for the image search scenario. Different from previous works which investigate image search user behavior and task characteristics based on either lab studies or large scale log analysis, we conducted a field study which lasted one month and involved 2,040 search queries from 555 search tasks. By this means, we collected a relatively large amount of practical search behavior data with extensive first-tier annotation from users. With this data set, we investigate how various image search intents affect users' search behavior, and try to adopt different signals to predict search satisfaction under a given intent. Meanwhile, external assessors were also employed to categorize each search task using four orthogonal intent taxonomies. Based on the hypothesis that behavior is dependent on task type, we analyze user search behavior on the field study data, examining characteristics of the session, click and mouse patterns. We also link the search satisfaction prediction to image search intent, which shows that different types of signals play different roles in satisfaction prediction as intent varies. Our findings indicate the importance of considering search intent in user behavior analysis and satisfaction prediction in image search.
DOI: https://doi.org/10.1145/3289600.3291013
Published: 2019-01-30
Citations: 21
Attitude Detection for One-Round Conversation: Jointly Extracting Target-Polarity Pairs
Zhaohao Zeng, Ruihua Song, Pingping Lin, T. Sakai
Abstract: We tackle Attitude Detection, which we define as the task of extracting the replier's attitude, i.e., a target-polarity pair, from a given one-round conversation. While previous studies considered Target Extraction and Polarity Classification separately, we regard them as subtasks of Attitude Detection. Our experimental results show that treating the two subtasks independently is not the optimal solution for Attitude Detection, as achieving high performance in each subtask is not sufficient for obtaining correct target-polarity pairs. Our jointly trained model AD-NET substantially outperforms the separately trained models by alleviating the target-polarity mismatch problem. Moreover, we propose a method utilising the attitude detection model to improve retrieval-based chatbots by re-ranking the response candidates with attitude features. Human evaluation indicates that with attitude detection integrated, the new responses to the sampled queries from are statistically significantly more consistent, coherent, engaging and informative than the original ones obtained from a commercial chatbot.
DOI: https://doi.org/10.1145/3289600.3291038
Published: 2019-01-30
Citations: 6