Proceedings of The Web Conference 2020最新文献

筛选
英文 中文
Anchored Model Transfer and Soft Instance Transfer for Cross-Task Cross-Domain Learning: A Study Through Aspect-Level Sentiment Classification 面向跨任务跨领域学习的锚定模型迁移和软实例迁移:基于层面情感分类的研究
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380034
Yaowei Zheng, Richong Zhang, Suyuchen Wang, Samuel Mensah, Yongyi Mao
{"title":"Anchored Model Transfer and Soft Instance Transfer for Cross-Task Cross-Domain Learning: A Study Through Aspect-Level Sentiment Classification","authors":"Yaowei Zheng, Richong Zhang, Suyuchen Wang, Samuel Mensah, Yongyi Mao","doi":"10.1145/3366423.3380034","DOIUrl":"https://doi.org/10.1145/3366423.3380034","url":null,"abstract":"Supervised learning relies heavily on readily available labelled data to infer an effective classification function. However, proposed methods under the supervised learning paradigm are faced with the scarcity of labelled data within domains, and are not generalized enough to adapt to other tasks. Transfer learning has proved to be a worthy choice to address these issues, by allowing knowledge to be shared across domains and tasks. In this paper, we propose two transfer learning methods Anchored Model Transfer (AMT) and Soft Instance Transfer (SIT), which are both based on multi-task learning, and account for model transfer and instance transfer, and can be combined into a common framework. We demonstrate the effectiveness of AMT and SIT for aspect-level sentiment classification showing the competitive performance against baseline models on benchmark datasets. Interestingly, we show that the integration of both methods AMT+SIT achieves state-of-the-art performance on the same task.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79881668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
The Fast and The Frugal: Tail Latency Aware Provisioning for Coping with Load Variations 快速和节俭:尾部延迟感知供应应对负载变化
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380117
Adithya Kumar, Iyswarya Narayanan, T. Zhu, A. Sivasubramaniam
{"title":"The Fast and The Frugal: Tail Latency Aware Provisioning for Coping with Load Variations","authors":"Adithya Kumar, Iyswarya Narayanan, T. Zhu, A. Sivasubramaniam","doi":"10.1145/3366423.3380117","DOIUrl":"https://doi.org/10.1145/3366423.3380117","url":null,"abstract":"Small and medium sized enterprises use the cloud for running online, user-facing, tail latency sensitive applications with well-defined fixed monthly budgets. For these applications, adequate system capacity must be provisioned to extract maximal performance despite the challenges of uncertainties in load and request-sizes. In this paper, we address the problem of capacity provisioning under fixed budget constraints with the goal of minimizing tail latency. To tackle this problem, we propose building systems using a heterogeneous mix of low latency expensive resources and cheap resources that provide high throughput per dollar. As load changes through the day, we use more faster resources to reduce tail latency during low load periods and more cheaper resources to handle the high load periods. To achieve these tail latency benefits, we introduce novel heterogeneity-aware scheduling and autoscaling algorithms that are designed for minimizing tail latency. Using software prototypes and by running experiments on the public cloud, we show that our approach can outperform existing capacity provisioning systems by reducing the tail latency by as much as 45% under fixed-budget settings.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77997205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
Keyword Search over Knowledge Graphs via Static and Dynamic Hub Labelings 通过静态和动态中心标签对知识图进行关键字搜索
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380110
Yuxuan Shi, Gong Cheng, E. Kharlamov
{"title":"Keyword Search over Knowledge Graphs via Static and Dynamic Hub Labelings","authors":"Yuxuan Shi, Gong Cheng, E. Kharlamov","doi":"10.1145/3366423.3380110","DOIUrl":"https://doi.org/10.1145/3366423.3380110","url":null,"abstract":"Keyword search is a prominent approach to querying Web data. For graph-structured data, a widely accepted semantics for keywords is based on group Steiner trees. For this NP-hard problem, existing algorithms with provable quality guarantees have prohibitive run time on large graphs. In this paper, we propose practical approximation algorithms with a guaranteed quality of computed answers and very low run time. Our algorithms rely on Hub Labeling (HL), a structure that labels each vertex in a graph with a list of vertices reachable from it, which we use to compute distances and shortest paths. We devise two HLs: a conventional static HL that uses a new heuristic to improve pruned landmark labeling, and a novel dynamic HL that inverts and aggregates query-relevant static labels to more efficiently process vertex sets. Our approach allows to compute a reasonably good approximation of answers to keyword queries in milliseconds on million-scale knowledge graphs.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"24 3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86940128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 27
Using Cliques with Higher-order Spectral Embeddings Improves Graph Visualizations 使用带有高阶谱嵌入的团块改进图形可视化
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380059
Huda Nassar, Caitlin Kennedy, Shweta Jain, Austin R. Benson, D. Gleich
{"title":"Using Cliques with Higher-order Spectral Embeddings Improves Graph Visualizations","authors":"Huda Nassar, Caitlin Kennedy, Shweta Jain, Austin R. Benson, D. Gleich","doi":"10.1145/3366423.3380059","DOIUrl":"https://doi.org/10.1145/3366423.3380059","url":null,"abstract":"In the simplest setting, graph visualization is the problem of producing a set of two-dimensional coordinates for each node that meaningfully shows connections and latent structure in a graph. Among other uses, having a meaningful layout is often useful to help interpret the results from network science tasks such as community detection and link prediction. There are several existing graph visualization techniques in the literature that are based on spectral methods, graph embeddings, or optimizing graph distances. Despite the large number of methods, it is still often challenging or extremely time consuming to produce meaningful layouts of graphs with hundreds of thousands of vertices. Existing methods often either fail to produce a visualization in a meaningful time window, or produce a layout colorfully called a “hairball”, which does not illustrate any internal structure in the graph. Here, we show that adding higher-order information based on cliques to a classic eigenvector based graph visualization technique enables it to produce meaningful plots of large graphs. We further evaluate these visualizations along a number of graph visualization metrics and we find that it outperforms existing techniques on a metric that uses random walks to measure the local structure. Finally, we show many examples of how our algorithm successfully produces layouts of large networks. Code to reproduce our results is available.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90276559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
ResQueue: A Smarter Datacenter Flow Scheduler ResQueue:一个更智能的数据中心流调度程序
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380012
Hamed Rezaei, Balajee Vamanan
{"title":"ResQueue: A Smarter Datacenter Flow Scheduler","authors":"Hamed Rezaei, Balajee Vamanan","doi":"10.1145/3366423.3380012","DOIUrl":"https://doi.org/10.1145/3366423.3380012","url":null,"abstract":"Datacenters host a mix of applications: foreground applications perform distributed lookups in order to service user queries and background applications perform batch processing tasks such as data reorganization, backup, and replication. While background flows produce the most load, foreground applications produce the most number of flows. Because packets from both types of applications compete at switches for network bandwidth, the performance of applications is sensitive to scheduling mechanisms. Existing schedulers use flow size to distinguish critical flows from non-critical flows. However, recent studies on datacenter workloads reveal that most flows are small (e.g., most flows consist of only a handful number of packets). In light of recent findings, we make the key observation that because most flows are small, flow size is not sufficient to distinguish critical flows from non-critical flows and therefore existing flow schedulers do not achieve the desired prioritization. In this paper, we introduce ResQueue, which uses a combination of flow size and packet history to calculate the priority of each flow. Our evaluation shows that ResQueue improves tail flow completion times of short flows by up to 60% over the state-of-the-art flow scheduling mechanisms.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86493573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Embedding the Scientific Record on the Web: Towards Automating Scientific Discoveries 在网络上嵌入科学记录:走向自动化科学发现
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3382667
Y. Gil
{"title":"Embedding the Scientific Record on the Web: Towards Automating Scientific Discoveries","authors":"Y. Gil","doi":"10.1145/3366423.3382667","DOIUrl":"https://doi.org/10.1145/3366423.3382667","url":null,"abstract":"Future AI systems will be key contributors to science, but this is unlikely to happen unless we reinvent our current publications and embed our scientific records in the Web as structured Web objects. This implies that our scientific papers of the future will be complemented with explicit, structured descriptions of the experiments, software, data, and workflows used to reach new findings. These scientific papers of the future will not only culminate the promise of open science and reproducible research, but also enable the creation of AI systems that can ingest and organize scientific methods and processes, re-run experiments and re-analyze results, and explore their own hypothesis in systematic and unbiased ways. In this talk, I will describe guidelines for writing scientific papers of the future that embed the scientific record on the Web, and our progress on AI systems capable of using them to systematically explore experiments. I will also outline a research agenda with seven key characteristics for creating AI scientists that will exploit the Web to independently make new discoveries [1]. AI scientists have the potential to transform science and the processes of scientific discovery [2, 3].","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82686326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Edge formation in Social Networks to Nurture Content Creators 社交网络的优势形成以培养内容创作者
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380267
Chun Lo, Emilie de Longueau, Ankan Saha, S. Chatterjee
{"title":"Edge formation in Social Networks to Nurture Content Creators","authors":"Chun Lo, Emilie de Longueau, Ankan Saha, S. Chatterjee","doi":"10.1145/3366423.3380267","DOIUrl":"https://doi.org/10.1145/3366423.3380267","url":null,"abstract":"Social networks act as major content marketplaces where creators and consumers come together to share and consume various kinds of content. Content ranking applications (e.g., newsfeed, moments, notifications) and edge recommendation products (e.g., connect to members, follow celebrities or groups or hashtags) on such platforms aim at improving the consumer experience. In this work, we focus on the creator experience and specifically on improving edge recommendations to better serve creators in such ecosystems. The audience and reach of creators – individuals, celebrities, publishers and companies – are critically shaped by these edge recommendation products. Hence, incorporating creator utility in such recommendations can have a material impact on their success, and in turn, on the marketplace. In this paper, we (i) propose a general framework to incorporate creator utility in edge recommendations, (ii) devise a specific method to estimate edge-level creator utilities for currently unformed edges, (iii) outline the challenges of measurement and propose a practical experiment design, and finally (iv) discuss the implementation of our proposal at scale on LinkedIn, a professional network with 645M+ members, and report our findings.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82244935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
Mining Points-of-Interest for Explaining Urban Phenomena: A Scalable Variational Inference Approach 挖掘兴趣点来解释城市现象:一种可扩展的变分推理方法
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380298
Christof Naumzik, Patrick Zoechbauer, S. Feuerriegel
{"title":"Mining Points-of-Interest for Explaining Urban Phenomena: A Scalable Variational Inference Approach","authors":"Christof Naumzik, Patrick Zoechbauer, S. Feuerriegel","doi":"10.1145/3366423.3380298","DOIUrl":"https://doi.org/10.1145/3366423.3380298","url":null,"abstract":"Points-of-interest (POIs; i.e., restaurants, bars, landmarks, and other entities) are common in web-mined data: they greatly explain the spatial distributions of urban phenomena. The conventional modeling approach relies upon feature engineering, yet it ignores the spatial structure among POIs. In order to overcome this shortcoming, the present paper proposes a novel spatial model for explaining spatial distributions based on web-mined POIs. Our key contributions are: (1) We present a rigorous yet highly interpretable formalization in order to model the influence of POIs on a given outcome variable. Specifically, we accommodate the spatial distributions of both the outcome and POIs. In our case, this modeled by the sum of latent Gaussian processes. (2) In contrast to previous literature, our model infers the influence of POIs without feature engineering, instead we model the influence of POIs via distance-weighted kernel functions with fully learnable parameterizations. (3) We propose a scalable learning algorithm based on sparse variational approximation. For this purpose, we derive a tailored evidence lower bound (ELBO) and, for appropriate likelihoods, we even show that an analytical expression can be obtained. This allows fast and accurate computation of the ELBO. Finally, the value of our approach for web mining is demonstrated in two real-world case studies. Our findings provide substantial improvements over state-of-the-art baselines with regard to both predictive and, in particular, explanatory performance. Altogether, this yields a novel spatial model for leveraging web-mined POIs. Within the context of location-based social networks, it promises an extensive range of new insights and use cases.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"505 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75214721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Multi-Objective Ranking Optimization for Product Search Using Stochastic Label Aggregation 基于随机标签聚合的产品搜索多目标排序优化
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380122
David Carmel, Elad Haramaty, Arnon Lazerson, L. Lewin-Eytan
{"title":"Multi-Objective Ranking Optimization for Product Search Using Stochastic Label Aggregation","authors":"David Carmel, Elad Haramaty, Arnon Lazerson, L. Lewin-Eytan","doi":"10.1145/3366423.3380122","DOIUrl":"https://doi.org/10.1145/3366423.3380122","url":null,"abstract":"Learning a ranking model in product search involves satisfying many requirements such as maximizing the relevance of retrieved products with respect to the user query, as well as maximizing the purchase likelihood of these products. Multi-Objective Ranking Optimization (MORO) is the task of learning a ranking model from training examples while optimizing multiple objectives simultaneously. Label aggregation is a popular solution approach for multi-objective optimization, which reduces the problem into a single objective optimization problem, by aggregating the multiple labels of the training examples, related to the different objectives, to a single label. In this work we explore several label aggregation methods for MORO in product search. We propose a novel stochastic label aggregation method which randomly selects a label per training example according to a given distribution over the labels. We provide a theoretical proof showing that stochastic label aggregation is superior to alternative aggregation approaches, in the sense that any optimal solution of the MORO problem can be generated by a proper parameter setting of the stochastic aggregation process. We experiment on three different datasets: two from the voice product search domain, and one publicly available dataset from the Web product search domain. We demonstrate empirically over these three datasets that MORO with stochastic label aggregation provides a family of ranking models that fully dominates the set of MORO models built using deterministic label aggregation.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73630719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 23
Natural Language Annotations for Search Engine Optimization 搜索引擎优化的自然语言注释
Proceedings of The Web Conference 2020 Pub Date : 2020-04-20 DOI: 10.1145/3366423.3380049
P. Jenkins, Jennifer Zhao, Heath Vinicombe, Anant Subramanian, Arun Prasad, Atillia Dobi, E. Li, Yunsong Guo
{"title":"Natural Language Annotations for Search Engine Optimization","authors":"P. Jenkins, Jennifer Zhao, Heath Vinicombe, Anant Subramanian, Arun Prasad, Atillia Dobi, E. Li, Yunsong Guo","doi":"10.1145/3366423.3380049","DOIUrl":"https://doi.org/10.1145/3366423.3380049","url":null,"abstract":"Understanding content at scale is a difficult but important problem for many platforms. Many previous studies focus on content understanding to optimize engagement with existing users. However, little work studies how to leverage better content understanding to attract new users. In this work, we build a framework for generating natural language content annotations and show how they can be used for search engine optimization. The proposed framework relies on an XGBoost model that labels “pins” with high probability phrases, and a logistic regression layer that learns to rank aggregated annotations for groups of content. The pipeline identifies keywords that are descriptive and contextually meaningful. We perform a large-scale production experiment deployed on the Pinterest platform and show that natural language annotations cause a 1-2% increase in traffic from leading search engines. This increase is statistically significant. Finally, we explore and interpret the characteristics of our annotations framework.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76820148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信