The World Wide Web Conference: Latest Papers

Learn2Clean: Optimizing the Sequence of Tasks for Web Data Preparation
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313602
Laure Berti-Équille
Abstract: Data cleaning and preparation has been a long-standing challenge in data science to avoid incorrect results and misleading conclusions obtained from dirty data. For a given dataset and a given machine learning-based task, a plethora of data preprocessing techniques and alternative data curation strategies may lead to dramatically different outputs with unequal quality performance. Most current work on data cleaning and automated machine learning, however, focus on developing either cleaning algorithms or user-guided systems or argue to rely on a principled method to select the sequence of data preprocessing steps that can lead to the optimal quality performance of. In this paper, we propose Learn2Clean, a method based on Q-Learning, a model-free reinforcement learning technique that selects, for a given dataset, a ML model, and a quality performance metric, the optimal sequence of tasks for preprocessing the data such that the quality of the ML model result is maximized. As a preliminary validation of our approach in the context of Web data analytics, we present some promising results on data preparation for clustering, regression, and classification on real-world data.
Citations: 31
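The Learn2Clean abstract describes Q-Learning selecting an ordered sequence of preprocessing tasks that maximizes a quality metric. The following is a minimal sketch of tabular Q-learning on a toy version of that problem, not the paper's implementation: the task names, the `quality` score table, and the hyperparameters are all hypothetical.

```python
import random

# Hypothetical preprocessing tasks; not the task set used by Learn2Clean.
TASKS = ["impute", "dedup", "normalize"]


def quality(pipeline):
    """Hypothetical quality score of a complete pipeline.

    In a real setting this would come from fitting the ML model on the
    prepared data and computing the chosen quality metric.
    """
    scores = {
        ("impute", "dedup", "normalize"): 0.90,
        ("dedup", "impute", "normalize"): 0.80,
    }
    return scores.get(tuple(pipeline), 0.50)


def q_learning(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Learn Q(state, action), where state is the tuple of tasks applied so far."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = ()
        while len(state) < len(TASKS):
            actions = [t for t in TASKS if t not in state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))  # exploit
            next_state = state + (action,)
            done = len(next_state) == len(TASKS)
            # Reward is only observed once the whole pipeline has been built.
            reward = quality(next_state) if done else 0.0
            future = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in TASKS if a not in next_state
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = next_state
    return q


def best_pipeline(q):
    """Follow the greedy policy from the empty pipeline."""
    state = ()
    while len(state) < len(TASKS):
        actions = [t for t in TASKS if t not in state]
        state += (max(actions, key=lambda a: q.get((state, a), 0.0)),)
    return list(state)
```

Calling `best_pipeline(q_learning())` on this toy table recovers the highest-scoring ordering, `["impute", "dedup", "normalize"]`. The sketch gives reward only at the end of an episode; whether Learn2Clean shapes rewards differently is not stated in the abstract.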
LearnerExp: Exploring and Explaining the Time Management of Online Learning Activity
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3314140
Huan He, Q. Zheng, Bo Dong
Abstract: How do learners schedule their online learning? This issue concerns both course instructors and researchers, especially in the context of self-paced online learning environments. Many indicators and methods have been proposed to understand and improve the time management of learning activities; however, there are few tools for visualizing, comparing and exploring time management to gain intuitive understanding. In this demo, we introduce the LearnExp, an interactive visual analytic system designed to explore the temporal patterns of learning activities and explain the relationships between academic performance and these patterns. This system will help instructors to comparatively explore the distribution of learner activities from multiple aspects, and to visually explain the time management of different learner groups with the prediction of learning performance.
Citations: 5
Keyphrase Extraction from Disaster-related Tweets
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313696
Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea
Abstract: While keyphrase extraction has received considerable attention in recent years, relatively few studies exist on extracting keyphrases from social media platforms such as Twitter, and even fewer for extracting disaster-related keyphrases from such sources. During a disaster, keyphrases can be extremely useful for filtering relevant tweets that can enhance situational awareness. Previously, joint training of two different layers of a stacked Recurrent Neural Network for keyword discovery and keyphrase extraction had been shown to be effective in extracting keyphrases from general Twitter data. We improve the model's performance on both general Twitter data and disaster-related Twitter data by incorporating contextual word embeddings, POS-tags, phonetics, and phonological features. Moreover, we discuss the shortcomings of the often used F1-measure for evaluating the quality of predicted keyphrases with respect to the ground truth annotations. Instead of the F1-measure, we propose the use of embedding-based metrics to better capture the correctness of the predicted keyphrases. In addition, we also present a novel extension of an embedding-based metric. The extension allows one to better control the penalty for the difference in the number of ground-truth and predicted keyphrases.
Citations: 29
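The keyphrase abstract contrasts the F1-measure with embedding-based metrics. As a rough illustration of why exact-match F1 can be harsh, the sketch below scores a near-synonym prediction both ways. The 2-d vectors in `TOY_VECTORS` and the `soft_recall` score are hypothetical stand-ins chosen for this example; they are not the embeddings or the metric proposed in the paper.

```python
import math


def exact_match_f1(predicted, gold):
    """F1 over sets of keyphrases, counting only exact string matches."""
    pred, ref = set(predicted), set(gold)
    tp = len(pred & ref)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 2-d word vectors, for illustration only.
TOY_VECTORS = {
    "flood": (1.0, 0.0),
    "flooding": (0.9, 0.1),
    "earthquake": (0.0, 1.0),
}


def phrase_vector(phrase):
    """Average the word vectors of a whitespace-tokenized phrase."""
    vecs = [TOY_VECTORS[w] for w in phrase.split()]
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_recall(predicted, gold):
    """For each gold phrase, take its best cosine match among predictions."""
    pred_vecs = [phrase_vector(p) for p in predicted]
    best = [max(cosine(phrase_vector(g), pv) for pv in pred_vecs) for g in gold]
    return sum(best) / len(best)
```

With the gold phrase "flood" and the prediction "flooding", `exact_match_f1` returns 0.0 even though the two phrases are close, while `soft_recall` on the toy vectors gives partial credit (a value above 0.99). A wholly unrelated prediction such as "earthquake" scores 0.0 under both.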
Dealing with Interdependencies and Uncertainty in Multi-Channel Advertising Campaigns Optimization
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313470
Alessandro Nuara, Nicola Sosio, F. Trovò, Maria Chiara Zaccardi, N. Gatti, Marcello Restelli
Abstract: In 2017, Internet ad spending reached 209 billion USD worldwide, while, e.g., TV ads brought in 178 billion USD. An Internet advertising campaign includes up to thousands of sub-campaigns on multiple channels, e.g., search, social, display, whose parameters (bid and daily budget) need to be optimized every day, subject to a (cumulative) budget constraint. Such a process is often unaffordable for humans and its automation is crucial. As also shown by marketing funnel models, the sub-campaigns are usually interdependent, e.g., display ads induce awareness, increasing the number of impressions (and, thus, also the number of conversions) of search ads. This interdependence is widely exploited by humans in the optimization process, whereas, to the best of our knowledge, no algorithm takes it into account. In this paper, we provide the first model capturing the sub-campaigns interdependence. We also provide the IDIL algorithm, which, employing Granger Causality and Gaussian Processes, learns from past data, and returns an optimal stationary bid/daily budget allocation. We prove theoretical guarantees on the loss of IDIL w.r.t. the clairvoyant solution, and we show empirical evidence of its superiority in both realistic and real-world settings when compared with existing approaches.
Citations: 11
Predicting Human Mobility via Variational Attention
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313610
Qiang Gao, Fan Zhou, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, Fengli Zhang
Abstract: An important task in Location based Social Network applications is to predict mobility - specifically, user's next point-of-interest (POI) - challenging due to the implicit feedback of footprints, sparsity of generated check-ins, and the joint impact of historical periodicity and recent check-ins. Motivated by recent success of deep variational inference, we propose VANext (Variational Attention based Next) POI prediction: a latent variable model for inferring user's next footprint, with historical mobility attention. The variational encoding captures latent features of recent mobility, followed by searching the similar historical trajectories for periodical patterns. A trajectory convolutional network is then used to learn historical mobility, significantly improving the efficiency over often used recurrent networks. A novel variational attention mechanism is proposed to exploit the periodicity of historical mobility patterns, combined with recent check-in preference to predict next POIs. We also implement a semi-supervised variant - VANext-S, which relies on variational encoding for pre-training all current trajectories in an unsupervised manner, and uses the latent variables to initialize the current trajectory learning. Experiments conducted on real-world datasets demonstrate that VANext and VANext-S outperform the state-of-the-art human mobility prediction models.
Citations: 96
Improving Outfit Recommendation with Co-supervision of Fashion Generation
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313614
Yujie Lin, Pengjie Ren, Zhumin Chen, Z. Ren, Jun Ma, M. de Rijke
Abstract: The task of fashion recommendation includes two main challenges: visual understanding and visual matching. Visual understanding aims to extract effective visual features. Visual matching aims to model a human notion of compatibility to compute a match between fashion items. Most previous studies rely on recommendation loss alone to guide visual understanding and matching. Although the features captured by these methods describe basic characteristics (e.g., color, texture, shape) of the input items, they are not directly related to the visual signals of the output items (to be recommended). This is problematic because the aesthetic characteristics (e.g., style, design), based on which we can directly infer the output items, are lacking. Features are learned under the recommendation loss alone, where the supervision signal is simply whether the given two items are matched or not. To address this problem, we propose a neural co-supervision learning framework, called the FAshion Recommendation Machine (FARM). FARM improves visual understanding by incorporating the supervision of generation loss, which we hypothesize to be able to better encode aesthetic information. FARM enhances visual matching by introducing a novel layer-to-layer matching mechanism to fuse aesthetic information more effectively, and meanwhile avoiding paying too much attention to the generation quality and ignoring the recommendation performance. Extensive experiments on two publicly available datasets show that FARM outperforms state-of-the-art models on outfit recommendation, in terms of AUC and MRR. Detailed analyses of generated and recommended items demonstrate that FARM can encode better features and generate high quality images as references to improve recommendation performance.
Citations: 40
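The FARM abstract reports results in terms of AUC and MRR. For readers unfamiliar with the two metrics, here is a minimal sketch of their textbook definitions. The function names and the example scores are hypothetical, and the paper's exact evaluation protocol (for instance how negatives are sampled) is not given in the abstract.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def mean_reciprocal_rank(ranked_lists, positives):
    """Average of 1/rank of the first correct item in each ranked list.

    A list that does not contain its positive item contributes 0.
    """
    total = 0.0
    for ranked, positive in zip(ranked_lists, positives):
        for rank, item in enumerate(ranked, start=1):
            if item == positive:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

For example, `auc([0.9, 0.4], [0.5, 0.1])` is 0.75 because three of the four positive/negative pairs are ordered correctly, and `mean_reciprocal_rank([["a", "b"], ["c", "d"]], ["a", "d"])` is 0.75 (reciprocal ranks 1 and 1/2).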
Context-aware Variational Trajectory Encoding and Human Mobility Inference
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313608
Fan Zhou, Xiaoli Yue, Goce Trajcevski, Ting Zhong, Kunpeng Zhang
Abstract: Unveiling human mobility patterns is an important task for many downstream applications like point-of-interest (POI) recommendation and personalized trip planning. Compelling results exist in various sequential modeling methods and representation techniques. However, discovering and exploiting the context of trajectories in terms of abstract topics associated with the motion can provide a more comprehensive understanding of the dynamics of patterns. We propose a new paradigm for moving pattern mining based on learning trajectory context, and a method - Context-Aware Variational Trajectory Encoding and Human Mobility Inference (CATHI) - for learning user trajectory representation via a framework consisting of: (1) a variational encoder and a recurrent encoder; (2) a variational attention layer; (3) two decoders. We simultaneously tackle two subtasks: (T1) recovering user routes (trajectory reconstruction); and (T2) predicting the trip that the user would travel (trajectory prediction). We show that the encoded contextual trajectory vectors efficiently characterize the hierarchical mobility semantics, from which one can decode the implicit meanings of trajectories. We evaluate our method on several public datasets and demonstrate that the proposed CATHI can efficiently improve the performance of both subtasks, compared to state-of-the-art approaches.
Citations: 29
Focusing Attention Network for Answer Ranking
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313518
Yufei Xie, Shuchun Liu, Tangren Yao, Yao Peng, Zhao Lu
Abstract: Answer ranking is an important task in Community Question Answering (CQA), by which “Good” answers should be ranked in front of “Bad” or “Potentially Useful” answers. The state of the art is the attention-based classification framework that learns the mapping between the questions and the answers. However, we observe that existing attention-based methods perform poorly on complicated question-answer pairs. One major reason is that existing methods cannot get accurate alignments between questions and answers for such pairs. We call the phenomenon “attention divergence”. In this paper, we propose a new attention mechanism, called Focusing Attention Network (FAN), which can automatically draw back the divergent attention by adding semantic and metadata features. Our model can focus on the most important part of the sentence and therefore improve the answer ranking performance. Experimental results on the CQA dataset of SemEval-2016 and SemEval-2017 demonstrate that our method respectively attains 79.38 and 88.72 on MAP and outperforms the Top-1 system in the shared task by 0.19 and 0.29.
Citations: 6
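The FAN abstract reports its results as MAP (mean average precision). A minimal sketch of the standard definition is given below. It assumes binary relevance flags (1 for a "Good" answer, 0 otherwise) and that every relevant answer appears somewhere in the ranked list; the official SemEval scorer may differ in details such as cutoffs, so this is an illustration rather than a reimplementation of it.

```python
def average_precision(relevance):
    """Average precision for one question.

    `relevance` is a list of 0/1 flags in ranked order (best-ranked first).
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of average precision over all questions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For a question whose ranked answers are flagged `[1, 0, 1]`, the average precision is (1/1 + 2/3) / 2, roughly 0.833; for `[0, 1]` it is 0.5. Scores such as 79.38 in the abstract are this quantity expressed as a percentage.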
Revisiting Mobile Advertising Threats with MAdLife
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313549
Gong Chen, W. Meng, J. Copeland
Abstract: Online advertising is one of the primary funding sources for a variety of content, services, and applications on both web and mobile platforms. Mobile in-app advertising reuses many existing web technologies under the same ad-serving model (i.e., users - publishers - ad networks - advertisers). Nevertheless, mobile in-app advertising is different from the traditional web advertising in many aspects. For example, malicious app developers can generate fraudulent ad clicks in an automated fashion, but malicious web publishers have to launch click fraud with bots. In spite of using the same underlying web infrastructure, advertising threats behave differently on the two platforms. Existing works have studied click fraud and malvertising separately in the mobile setting. However, it is unknown if there exists a relationship between these two dominant threats. In this paper, we present an ad collection framework, MAdLife, on Android to capture all the in-app ad traffic generated during an ad's entire lifespan. MAdLife allows us to revisit both threats in a fine-grained manner and study the relationship between them. It further enables the exploration of other threats related to ad landing pages. We analyzed 5.7K Android apps crawled from the Google Play Store, and collected 83K ads and their landing pages using MAdLife. Similar to traditional web ads, 58K ads landed on web pages. We discovered 37 click-fraud apps, and found that 1.49% of the 58K ads were malicious. We also revealed a strong correlation between fraudulent apps and malicious ads. Specifically, 15.44% of malicious ads originated from the fraudulent apps. Conversely, 18.36% of the ads served in the fraudulent apps were malicious, while only 1.28% were malicious in the remaining apps. This suggests that users of fraudulent apps are much more (14x) likely to encounter malicious ads. Additionally, we discovered that 243 popular JavaScript snippets embedded by over 10% of the landing pages were malicious. Finally, we conducted the first analysis on inappropriate mobile in-app ads.
Citations: 16
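The "14x" figure in the MAdLife abstract follows directly from the two malicious-ad rates it quotes. The short check below reproduces that ratio and estimates the number of malicious ads. The variable names are illustrative, and because "58K" is itself a rounded figure, the count is only approximate.

```python
# Percentages quoted in the abstract.
malicious_rate_fraud_apps = 18.36  # % of ads served in click-fraud apps
malicious_rate_other_apps = 1.28   # % of ads served in the remaining apps

# Relative likelihood of meeting a malicious ad in a fraudulent app.
likelihood_ratio = malicious_rate_fraud_apps / malicious_rate_other_apps

# Approximate number of malicious ads among those that landed on web pages.
ads_on_web_pages = 58_000          # "58K" in the abstract, a rounded value
malicious_share = 1.49 / 100
approx_malicious_ads = ads_on_web_pages * malicious_share

print(f"likelihood ratio: {likelihood_ratio:.2f}")
print(f"approx. malicious ads: {approx_malicious_ads:.0f}")
```

The ratio comes out at about 14.34, consistent with the abstract's "14x", and 1.49% of 58K is roughly 864 ads.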
MCVAE: Margin-based Conditional Variational Autoencoder for Relation Classification and Pattern Generation
The World Wide Web Conference Pub Date : 2019-05-13 DOI: 10.1145/3308558.3313436
Fenglong Ma, Yaliang Li, Chenwei Zhang, Jing Gao, Nan Du, Wei Fan
Abstract: Relation classification is a basic yet important task in natural language processing. Existing relation classification approaches mainly rely on distant supervision, which assumes that a bag of sentences mentioning a pair of entities and extracted from a given corpus should express the same relation type of this entity pair. The training of these models needs a lot of high-quality bag-level data. However, in some specific domains, such as the medical domain, it is difficult to obtain sufficient and high-quality sentences in a text corpus that mention two entities with a certain medical relation between them. In such a case, it is hard for existing discriminative models to capture the representative features (i.e., common patterns) from diversely expressed entity pairs with a given relation. Thus, the classification performance cannot be guaranteed when limited features are obtained from the corpus. To address this challenge, in this paper, we propose to employ a generative model, called conditional variational autoencoder (CVAE), to handle the pattern sparsity. We define that each relation has an individually learned latent distribution from all possible sentences expressing this relation. As these distributions are learned based on the purpose of input reconstruction, the model's classification ability may not be strong enough and should be improved. By distinguishing the differences among different relation distributions, a margin-based regularizer is designed, which leads to a margin-based CVAE (MCVAE) that can significantly enhance the classification ability. Besides, MCVAE can automatically generate semantically meaningful patterns that describe the given relations. Experiments on two real-world datasets validate the effectiveness of the proposed MCVAE on the tasks of relation classification and relation-specific pattern generation.
Citations: 5