The World Wide Web Conference: Latest Publications

Rethinking the Detection of Child Sexual Abuse Imagery on the Internet
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313482
Elie Bursztein, Einat Clarke, Michelle DeLaune, David M. Eliff, Nick Hsu, Lindsey Olson, John Shehan, Madhukar Thakur, Kurt Thomas, Travis Bright
Abstract: Over the last decade, the illegal distribution of child sexual abuse imagery (CSAI) has transformed alongside the rise of online sharing platforms. In this paper, we present the first longitudinal measurement study of CSAI distribution online and the threat it poses to society's ability to combat child sexual abuse. Our results illustrate that CSAI has grown exponentially, to nearly 1 million detected events per month, exceeding the capabilities of independent clearinghouses and law enforcement to take action. In order to scale CSAI protections moving forward, we discuss techniques for automating detection and response by using recent advancements in machine learning.
Citations: 60
Addressing Trust Bias for Unbiased Learning-to-Rank
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313697
Aman Agarwal, Xuanhui Wang, Cheng Li, Michael Bendersky, Marc Najork
Abstract: Existing unbiased learning-to-rank models use counterfactual inference, notably Inverse Propensity Scoring (IPS), to learn a ranking function from biased click data. They handle the click incompleteness bias, but usually assume that the clicks are noise-free, i.e., a clicked document is always assumed to be relevant. In this paper, we relax this unrealistic assumption and study click noise explicitly in the unbiased learning-to-rank setting. Specifically, we model the noise as the position-dependent trust bias and propose a noise-aware Position-Based Model, named TrustPBM, to better capture user click behavior. We propose an Expectation-Maximization algorithm to estimate both examination and trust bias from click data in TrustPBM. Furthermore, we show that it is difficult to use a pure IPS method to incorporate click noise and thus propose a novel method that combines a Bayes rule application with IPS for unbiased learning-to-rank. We evaluate our proposed methods on three personal search data sets and demonstrate that our proposed model can significantly outperform the existing unbiased learning-to-rank methods.
Citations: 74
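To illustrate the Inverse Propensity Scoring idea named in the abstract above, here is a minimal sketch that weights each click by the inverse of an assumed examination propensity for the position where the document was shown. The function name `ips_weighted_clicks`, the propensity values, and the toy click log are all hypothetical. This is plain IPS under a position-based model, not the TrustPBM method the paper proposes, and it keeps the noise-free click assumption that the paper relaxes.

```python
from collections import defaultdict

def ips_weighted_clicks(click_log, propensity):
    """Sum clicks per document, each weighted by 1 / P(examined at position)."""
    scores = defaultdict(float)
    for doc_id, position, clicked in click_log:
        if clicked:
            # A click at a rarely examined position counts for more.
            scores[doc_id] += 1.0 / propensity[position]
    return dict(scores)

# Hypothetical examination propensities by 1-based rank position.
propensity = {1: 1.0, 2: 0.5, 3: 0.25}

# Toy click log: (document id, position shown, clicked?)
click_log = [
    ("a", 1, True),
    ("b", 2, True),
    ("a", 1, True),
    ("c", 3, True),
    ("b", 2, False),
]

scores = ips_weighted_clicks(click_log, propensity)
print(scores)
```

In this toy log, document "c" receives a single click at position 3 but ends up with the largest weighted score, because a click at a position with low examination probability is treated as stronger evidence of relevance.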
A Family of Fuzzy Orthogonal Projection Models for Monolingual and Cross-lingual Hypernymy Prediction
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313439
Chengyu Wang, Yan Fan, Xiaofeng He, Aoying Zhou
Abstract: Hypernymy is a semantic relation, expressing the "is-a" relation between a concept and its instances. Such relations are building blocks for large-scale taxonomies, ontologies and knowledge graphs. Recently, much progress has been made for hypernymy prediction in English using textual patterns and/or distributional representations. However, applying such techniques to other languages is challenging due to the high language dependency of these methods and the lack of large training datasets for lower-resourced languages. In this work, we present a family of fuzzy orthogonal projection models for both monolingual and cross-lingual hypernymy prediction. For the monolingual task, we propose a Multi-Wahba Projection (MWP) model to distinguish hypernymy vs. non-hypernymy relations based on word embeddings. This model establishes distributional fuzzy mappings from embeddings of a term to those of its hypernyms and non-hypernyms, which account for the complicated linguistic regularities of these relations. For cross-lingual hypernymy prediction, a Transfer MWP (TMWP) model is proposed to transfer the semantic knowledge from the source language to target languages based on neural word translation. Additionally, an Iterative Transfer MWP (ITMWP) model is built upon TMWP, which augments the training sets of target languages when target languages are lower-resourced with limited training data. Experiments show i) MWP outperforms previous methods over two hypernymy prediction tasks for English; and ii) TMWP and ITMWP are effective at predicting hypernymy across seven non-English languages.
Citations: 15
ContraVis: Contrastive and Visual Topic Modeling for Comparing Document Collections
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313617
T. Le, L. Akoglu
Abstract: Given posts on 'abortion' and posts on 'religion' from a political forum, how can we find topics that are discriminative and those in common? In general, (1) how can we compare and contrast two or more different ('labeled') document collections? Moreover, (2) how can we visualize the data (in 2-d or 3-d) to best reflect the similarities and differences between the collections? We introduce (to the best of our knowledge) the first contrastive and visual topic model, called ContraVis, that jointly addresses both problems: (1) contrastive topic modeling, and (2) contrastive visualization. That is, ContraVis learns not only latent topics but also embeddings for the documents, topics and labels for visualization. ContraVis exhibits three key properties by design. It is (i) Contrastive: It enables comparative analysis of different document corpora by extracting latent discriminative and common topics across labeled documents; (ii) Visually-expressive: Unlike numerous existing models, it also produces a visualization for all of the documents, labels, and the extracted topics, where proximity in the coordinate space is reflective of proximity in semantic space; (iii) Unified: It extracts topics and visual coordinates simultaneously under a joint model. Through extensive experiments on real-world datasets, we show ContraVis's potential for providing visual contrastive analysis of multiple document collections. We show both qualitatively and quantitatively that ContraVis significantly outperforms both unsupervised and supervised state-of-the-art topic models in contrastive power, semantic coherence and visual effectiveness.
Citations: 12
Self- and Cross-Excitation in Stack Exchange Question & Answer Communities
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313440
Tiago Santos, Simon Walk, Roman Kern, M. Strohmaier, D. Helic
Abstract: In this paper, we quantify the impact of self- and cross-excitation on the temporal development of user activity in Stack Exchange Question & Answer (Q&A) communities. We study differences in user excitation between growing and declining Stack Exchange communities, and between those dedicated to STEM and humanities topics by leveraging Hawkes processes. We find that growing communities exhibit early stage, high cross-excitation by a small core of power users reacting to the community as a whole, and strong long-term self-excitation in general and cross-excitation by casual users in particular, suggesting community openness towards less active users. Further, we observe that communities in the humanities exhibit long-term power user cross-excitation, whereas in STEM communities activity is more evenly distributed towards casual user self-excitation. We validate our findings via permutation tests and quantify the impact of these excitation effects with a range of prediction experiments. Our work enables researchers to quantitatively assess the evolution and activity potential of Q&A communities.
Citations: 7
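The abstract above relies on Hawkes processes, in which each past event temporarily raises the rate of future events. As a rough sketch of self-excitation, the function below evaluates the conditional intensity of a univariate Hawkes process, lambda(t) = mu + sum over past events s < t of alpha * exp(-beta * (t - s)). The exponential kernel and the parameter names `mu`, `alpha`, and `beta` are assumptions made for illustration; the abstract does not state which kernel or parameterization the authors use. Cross-excitation would add analogous terms driven by the events of other user groups.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a univariate Hawkes process (exponential kernel).

    mu    : baseline event rate (assumed constant)
    alpha : jump in intensity caused by each past event
    beta  : decay rate of that jump
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation

# Hypothetical event history (e.g. posting times in arbitrary units).
history = [1.0, 1.5, 4.0]

# Intensity shortly after a burst of events versus long after it.
after_burst = hawkes_intensity(1.6, history, mu=0.5, alpha=0.8, beta=1.0)
long_after = hawkes_intensity(3.9, history, mu=0.5, alpha=0.8, beta=1.0)
print(after_burst, long_after)
```

With no past events the intensity equals the baseline `mu`, and the contribution of each event fades as time passes, so the value shortly after the burst is higher than the value long after it.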
NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313446
J. Qiu, Yuxiao Dong, Hao Ma, Jun Yu Li, Chi Wang, Kuansan Wang, Jie Tang
Abstract: We study the problem of large-scale network embedding, which aims to learn latent representations for network mining applications. Previous research shows that 1) popular network embedding benchmarks, such as DeepWalk, are in essence implicitly factorizing a matrix with a closed form, and 2) the explicit factorization of such a matrix generates more powerful embeddings than existing methods. However, directly constructing and factorizing this matrix, which is dense, is prohibitively expensive in terms of both time and space, making it not scalable for large networks. In this work, we present the algorithm of large-scale network embedding as sparse matrix factorization (NetSMF). NetSMF leverages theories from spectral sparsification to efficiently sparsify the aforementioned dense matrix, enabling significantly improved efficiency in embedding learning. The sparsified matrix is spectrally close to the original dense one with a theoretically bounded approximation error, which helps maintain the representation power of the learned embeddings. We conduct experiments on networks of various scales and types. Results show that among both popular benchmarks and factorization based methods, NetSMF is the only method that achieves both high efficiency and effectiveness. We show that NetSMF requires only 24 hours to generate effective embeddings for a large-scale academic collaboration network with tens of millions of nodes, while it would cost DeepWalk months and is computationally infeasible for the dense matrix factorization solution. The source code of NetSMF is publicly available.
Citations: 143
MARINE: Multi-relational Network Embeddings with Relational Proximity and Node Attributes
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313715
Ming-Han Feng, Chin-Chi Hsu, Cheng-te Li, Mi-Yen Yeh, Shou-de Lin
Abstract: Network embedding aims at learning an effective vector transformation for entities in a network. We observe that there are two diverse branches of network embedding: for homogeneous graphs and for multi-relational graphs. This paper then proposes MARINE, a unified embedding framework for both homogeneous and multi-relational networks to preserve both the proximity and relation information. We also extend the framework to incorporate existing features of nodes in a graph, which can further be exploited for the ensemble of embedding. Our solution has complexity linear in the number of edges, which is suitable for large-scale network applications. Experiments conducted on several real-world network datasets, along with applications in link prediction and multi-label classification, exhibit the superiority of our proposed MARINE.
Citations: 21
Review Response Generation in E-Commerce Platforms with External Product Information
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313581
Lujun Zhao, Kaisong Song, Changlong Sun, Qi Zhang, Xuanjing Huang, Xiaozhong Liu
Abstract: "User reviews" are becoming an essential component of e-commerce. When buyers write a negative or doubting review, ideally, the sellers need to quickly give a response to minimize the potential impact. When the number of reviews is growing at a frightening speed, there is an urgent need to build a response writing assistant for customer service providers. In order to generate high-quality responses, the algorithm needs to consume and understand the information from both the original review and the target product. The classical sequence-to-sequence (Seq2Seq) methods can hardly satisfy this requirement. In this study, we propose a novel deep neural network model based on the Seq2Seq framework for the review response generation task in e-commerce platforms, which can incorporate product information by a gated multi-source attention mechanism and a copy mechanism. Moreover, we employ a reinforcement learning technique to reduce the exposure bias problem. To evaluate the proposed model, we constructed a large-scale dataset from a popular e-commerce website, which contains product information. Empirical studies on both automatic evaluation metrics and human annotations show that the proposed model can generate informative and diverse responses, significantly outperforming state-of-the-art text generation models.
Citations: 16
Sensitivity Analysis of Centralities on Unweighted Networks
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313422
Shogo Murai, Yuichi Yoshida
Abstract: Revealing important vertices is a fundamental task in network analysis. As such, many indicators have been proposed for doing so, which are collectively called centralities. However, the abundance of studies on centralities blurs their differences. In this work, we compare centralities based on their sensitivity to modifications in the graph. Specifically, we introduce a quantitative measure called (average-case) edge sensitivity, which measures how much the centrality value of a uniformly chosen vertex (or an edge) changes when we remove a uniformly chosen edge. Edge sensitivity is applicable to unweighted graphs, regarding which, to our knowledge, there has been no theoretical analysis of the centralities. We conducted a theoretical analysis of the edge sensitivities of six major centralities: the closeness centrality, harmonic centrality, betweenness centrality, endpoint betweenness centrality, PageRank, and spanning tree centrality. Our experimental results on synthetic and real graphs confirm the tendency predicted by the theoretical analysis. We also discuss an extension of edge sensitivity to the setting in which we remove a uniformly chosen set of edges of size k for an integer k ≥ 1.
Citations: 13
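The notion of average-case edge sensitivity described in the abstract above can be made concrete with a small brute-force computation. The sketch below averages the change in harmonic centrality over every (removed edge, vertex) pair, which corresponds to choosing both uniformly at random. Measuring the change as an unnormalized absolute difference is an assumption made here for illustration, since the abstract does not give the paper's exact normalization, and the function names are hypothetical.

```python
from collections import deque

def harmonic_centrality(adj, source):
    """Sum of 1/d(source, v) over vertices reachable from source (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    total = 0.0
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                total += 1.0 / dist[v]
                queue.append(v)
    return total

def build_adjacency(vertices, edges):
    """Undirected, unweighted adjacency sets."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def average_edge_sensitivity(vertices, edges):
    """Mean absolute centrality change over all (removed edge, vertex) pairs."""
    base_adj = build_adjacency(vertices, edges)
    base = {v: harmonic_centrality(base_adj, v) for v in vertices}
    total = 0.0
    for removed in edges:
        adj = build_adjacency(vertices, [e for e in edges if e != removed])
        for v in vertices:
            total += abs(base[v] - harmonic_centrality(adj, v))
    return total / (len(edges) * len(vertices))

# Toy example: the path graph 0 - 1 - 2.
vertices = [0, 1, 2]
edges = [(0, 1), (1, 2)]
sensitivity = average_edge_sensitivity(vertices, edges)
print(sensitivity)
```

The brute-force loop recomputes every centrality value once per removed edge, so it is only practical for small graphs. It is meant to show what is being averaged, not how the paper derives its theoretical bounds.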
Unnecessarily Identifiable: Quantifying the fingerprintability of browser extensions due to bloat
The World Wide Web Conference Pub Date: 2019-05-13 DOI: 10.1145/3308558.3313458
Oleksii Starov, Pierre Laperdrix, A. Kapravelos, Nick Nikiforakis
Abstract: In this paper, we investigate to what extent the page modifications that make browser extensions fingerprintable are necessary for their operation. We characterize page modifications that are completely unnecessary for the extension's functionality as extension bloat. By analyzing 58,034 extensions from the Google Chrome store, we discovered that 5.7% of them were unnecessarily identifiable because of extension bloat. To protect users against unnecessary extension fingerprinting due to bloat, we describe the design and implementation of an in-browser mechanism that provides coarse-grained access control for extensions on all websites. The proposed mechanism and its built-in policies not only protect users from fingerprinting but also offer additional protection against malicious extensions exfiltrating user data from sensitive websites.
Citations: 28