Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining: Latest Publications

Cognitive Biases in Crowdsourcing

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159654

Carsten Eickhoff

Citations: 112

Performance Analysis of a Privacy Constrained kNN Recommendation Using Data Sketches

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159673

A. Afsharinejad, N. Hurley

Abstract: This paper evaluates two algorithms, BLIP and JLT, for creating differentially private data sketches of user profiles, in terms of their ability to protect a kNN collaborative filtering algorithm from an inference attack by third parties. The transformed user profiles are employed in a user-based top-N collaborative filtering system. For the first time, a theoretical analysis of BLIP is carried out to derive expressions that relate its parameters to its performance. This allows the two techniques to be compared fairly. We examine the impact of deploying these approaches on the utility of the system (its ability to make good recommendations) and on its privacy level (the ability of third parties to make inferences about the underlying user preferences). An active inference attack is evaluated that consists of injecting a number of tailored sybil profiles into the system database. Profile data of targeted users is then inferred from the recommendations made to the sybils. Although the differentially private sketches are designed to allow the transformed user profiles to be published without compromising privacy, the attack we examine does not use such information and depends only on some pre-existing knowledge of some user preferences and on the neighbourhood size of the kNN algorithm. Our analysis therefore assesses in practical terms a relatively weak privacy attack, which is extremely simple to apply in systems that allow low-cost generation of sybils. We find that, for a given differential privacy level, BLIP injects less noise into the system, but for a given level of noise, JLT offers a more compact representation.
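To make the idea of a noisy profile sketch concrete, here is a minimal illustration in the spirit of BLIP, which is generally described as encoding a profile in a Bloom filter and then flipping each bit at random. Everything named here (`bloom_bits`, `flip_bits`, `num_bits`, `num_hashes`, `flip_prob`) is a hypothetical choice for illustration, and the abstract does not state how the flip probability relates to the differential privacy level, so `flip_prob` is left as a free parameter. It is not the authors' implementation.

```python
import hashlib
import random


def bloom_bits(items, num_bits=64, num_hashes=3):
    """Encode a set of item identifiers as a Bloom filter bit list.

    num_bits and num_hashes are illustrative defaults, not values from the paper.
    """
    bits = [0] * num_bits
    for item in items:
        for i in range(num_hashes):
            # Derive the i-th hash position deterministically from SHA-256.
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            bits[int(digest, 16) % num_bits] = 1
    return bits


def flip_bits(bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob (assumed parameter)."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


# A user profile is treated here as a set of liked item identifiers.
profile = {"item-12", "item-40", "item-97"}
sketch = flip_bits(bloom_bits(profile), flip_prob=0.1, rng=random.Random(0))
```

A larger `flip_prob` hides more about the original profile but also degrades the similarity estimates that a kNN recommender would compute on the sketches, which is the utility versus privacy trade-off the abstract describes.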

Citations: 6

Mining Twitter for Fine-Grained Political Opinion Polarity Classification, Ideology Detection and Sarcasm Detection

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3170461

Sandeepa Kannangara

Citations: 32

Neural Graph Learning: Training Neural Networks Using Graphs

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159731

T. Bui, Sujith Ravi, Vivek Ramavajjala

Abstract: Label propagation is a powerful and flexible semi-supervised learning technique on graphs. Neural networks, on the other hand, have proven track records in many supervised learning tasks. In this work, we propose a training framework with a graph-regularised objective, namely Neural Graph Machines, that can combine the power of neural networks and label propagation. This work generalises previous literature on graph-augmented training of neural networks, enabling it to be applied to multiple neural architectures (feed-forward NNs, CNNs and LSTM RNNs) and a wide range of graphs. The new objective allows the neural networks to harness both labeled and unlabeled data by (a) allowing the network to train using labeled data as in the supervised setting, and (b) biasing the network to learn similar hidden representations for neighboring nodes on a graph, in the same vein as label propagation. Such architectures with the proposed objective can be trained efficiently using stochastic gradient descent and scaled to large graphs, with a runtime that is linear in the number of edges. The proposed joint training approach convincingly outperforms many existing methods on a wide range of tasks (multi-label classification on social graphs, news categorization, document classification and semantic intent classification), with multiple forms of graph inputs (including graphs with and without node-level features) and using different types of neural networks.
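The two ingredients of the objective described in this abstract, a supervised term on labeled nodes and a penalty that pulls the hidden representations of neighboring nodes together, can be sketched as a single loss function. This is a simplified illustration under stated assumptions: squared Euclidean distance as the dissimilarity measure, one trade-off weight `alpha`, and hypothetical function and argument names. The objective in the paper may differ in its details.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(probs[label])


def squared_distance(a, b):
    """Squared Euclidean distance between two hidden representations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def graph_regularised_loss(probs, labels, hidden, edges, alpha=0.1):
    """Supervised loss on labeled nodes plus a neighbour-similarity penalty.

    probs:  node -> predicted class probabilities
    labels: node -> true class index (labeled nodes only)
    hidden: node -> hidden representation (labeled and unlabeled nodes)
    edges:  list of (u, v, weight) tuples
    alpha:  assumed trade-off hyperparameter (hypothetical name)
    """
    supervised = sum(cross_entropy(probs[n], y) for n, y in labels.items())
    # One pass over the edge list, so the cost grows linearly with the edges.
    penalty = sum(w * squared_distance(hidden[u], hidden[v]) for u, v, w in edges)
    return supervised + alpha * penalty


loss = graph_regularised_loss(
    probs={"a": [0.5, 0.5]},
    labels={"a": 0},
    hidden={"a": [0.0, 0.0], "b": [1.0, 1.0]},
    edges=[("a", "b", 2.0)],
    alpha=0.5,
)
```

Node `b` has no label, yet it still contributes to the loss through its edge to `a`, which is how unlabeled data influences training in this kind of objective.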

Citations: 71

Tutorial on Metrics of User Engagement: Applications to News, Search and E-Commerce

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3162010

M. Lalmas, Liangjie Hong

Citations: 11

Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159712

Mingming Sun, Xu Li, Xin Wang, M. Fan, Yue Feng, Ping Li

Abstract: In this paper, we consider the problem of open information extraction (OIE) for extracting entity-level and relation-level intermediate structures from open-domain sentences. We focus on four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and propose a unified knowledge expression form, SAOKE, to express them. We publicly release a data set that contains 48,248 sentences and the corresponding facts in the SAOKE format, labeled by crowdsourcing. To our knowledge, this is the largest publicly available human-labeled data set for open information extraction tasks. Using this labeled SAOKE data set, we train an end-to-end neural model using the sequence-to-sequence paradigm, called Logician, to transform sentences into facts. Unlike existing algorithms, which generally focus on extracting each single fact without regard to other possible facts, Logician performs a global optimization over all possible involved facts for each sentence, in which facts not only compete with each other to attract the attention of words, but also cooperate to share words. An experimental study on various types of open-domain relation extraction tasks reveals the consistent superiority of Logician over other state-of-the-art algorithms. The experiments verify the reasonableness of the SAOKE format, the value of the SAOKE data set, the effectiveness of the proposed Logician model, and the feasibility of applying the end-to-end learning paradigm to supervised data sets for the challenging tasks of open information extraction.

Citations: 51

A Unified Processing Paradigm for Interactive Location-based Web Search

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159667

Sheng Wang, Z. Bao, Shixun Huang, Rui Zhang

Abstract: This paper studies location-based web search and aims to build a unified processing paradigm for two purposes: (1) to efficiently support each of the various types of location-based queries (kNN query, top-k spatial-textual query, etc.) on two major forms of geo-tagged data, i.e., spatial point data such as geo-tagged web documents, and spatial trajectory data such as a sequence of geo-tagged travel blogs by a user; (2) to support interactive search that provides a quick response for a query session, within which a user usually keeps refining her query by either issuing different query types or specifying different constraints (e.g., adding a keyword and/or location, changing the choice of k, etc.) until she finds the desired results. To achieve this goal, we first propose a general top-k query called the Monotone Aggregate Spatial Keyword (MASK) query, which is able to cover most types of location-based web search. Next, we develop a unified indexing structure (called the Textual-Grid-Point Inverted Index) and query processing paradigm (called the ETAIL Algorithm) to answer a single MASK query efficiently. Furthermore, we extend ETAIL to provide interactive search for multiple queries within one query session, by exploiting the commonality of the textual and/or spatial dimensions among queries. Last, extensive experiments on four real datasets verify the robustness and efficiency of our approach.
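As a rough illustration of what a top-k query with a monotone aggregate over spatial and textual scores looks like, the sketch below ranks geo-tagged documents by brute force. It is only a baseline for intuition, not the ETAIL algorithm or its index. The scoring function (a weighted sum of a distance-based proximity score and keyword overlap), the parameters `beta` and `max_dist`, and all names are assumptions made for this example.

```python
import heapq
import math


def score(doc, query_loc, query_terms, beta=0.5, max_dist=10.0):
    """Weighted sum of spatial proximity and keyword overlap (both in [0, 1]).

    A weighted sum with non-negative weights is monotone: improving either
    component never lowers the total score.
    """
    spatial = max(0.0, 1.0 - math.dist(doc["loc"], query_loc) / max_dist)
    if query_terms:
        textual = len(query_terms & doc["terms"]) / len(query_terms)
    else:
        textual = 0.0
    return beta * spatial + (1 - beta) * textual


def top_k(docs, query_loc, query_terms, k):
    """Return the k highest-scoring documents by scanning every document."""
    return heapq.nlargest(k, docs, key=lambda d: score(d, query_loc, query_terms))


docs = [
    {"id": "cafe", "loc": (0.0, 0.0), "terms": {"coffee", "wifi"}},
    {"id": "bar", "loc": (1.0, 1.0), "terms": {"beer"}},
    {"id": "far_cafe", "loc": (9.0, 0.0), "terms": {"coffee", "wifi"}},
]
results = top_k(docs, query_loc=(0.0, 0.0), query_terms={"coffee"}, k=2)
```

Refining a query within a session, for instance changing `k` or adding a keyword, would rerun this scan from scratch. Avoiding that repeated work is the motivation the abstract gives for a shared index and for reusing results across queries.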

Citations: 9

Identifying Informational vs. Conversational Questions on Community Question Answering Archives

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159733

Ido Guy, Victor Makarenkov, Niva Hazon, L. Rokach, Bracha Shapira

Citations: 13

Recommendation in Heterogeneous Information Networks Based on Generalized Random Walk Model and Bayesian Personalized Ranking

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159715

Zhengshen Jiang, Hongzhi Liu, Bin Fu, Zhonghai Wu, Zhang Tao

Abstract: Recommendation based on heterogeneous information networks (HINs) is attracting more and more attention due to its ability to emulate collaborative filtering, content-based filtering, context-aware recommendation and combinations of any of these recommendation semantics. Random walk based methods are usually used to mine the paths, weigh the paths, and compute the closeness or relevance between two nodes in a HIN. A key to the success of these methods is how to properly set the weights of links in a HIN. In existing methods, the weights of links are mostly set heuristically. In this paper, we propose a Bayesian Personalized Ranking (BPR) based machine learning method, called HeteLearn, to learn the weights of links in a HIN. In order to model user preferences for personalized recommendation, we also propose a generalized random walk with restart model on HINs. We evaluate the proposed method in a personalized recommendation task and a tag recommendation task. Experimental results show that our method performs significantly better than both traditional collaborative filtering and state-of-the-art HIN-based recommendation methods.
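The role that link weights play in a random walk with restart can be seen in a small power-iteration sketch, shown together with the standard BPR pairwise loss that a weight-learning method could minimise. This is plain random walk with restart on a weighted graph, not the generalized model or the HeteLearn training procedure from the paper. The function names, the `restart` value, and the toy graph are assumptions for illustration.

```python
import math


def random_walk_with_restart(weights, start, restart=0.15, iters=100):
    """Approximate random-walk-with-restart scores by power iteration.

    weights: node -> {neighbour: non-negative link weight}; every neighbour
             must also appear as a key of weights.
    start:   the node the walker jumps back to with probability restart.
    """
    nodes = list(weights)
    scores = {n: 0.0 for n in nodes}
    scores[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            total = sum(weights[u].values())
            if total == 0:
                # Dangling node: return its probability mass to the start node.
                nxt[start] += (1 - restart) * scores[u]
                continue
            for v, w in weights[u].items():
                # Move along each link in proportion to its weight.
                nxt[v] += (1 - restart) * scores[u] * w / total
        nxt[start] += restart
        scores = nxt
    return scores


def bpr_pair_loss(score_pos, score_neg):
    """BPR pairwise loss: -log(sigmoid(score_pos - score_neg))."""
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))


graph = {
    "user": {"item1": 3.0, "item2": 1.0},
    "item1": {"user": 1.0},
    "item2": {"user": 1.0},
}
scores = random_walk_with_restart(graph, start="user")
```

Because the walker leaves `user` along each link in proportion to its weight, `item1` ends up with a higher score than `item2`. Changing the weights changes the ranking, which is why the abstract treats setting them properly as the key problem.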

Citations: 65

A Path-constrained Framework for Discriminating Substitutable and Complementary Products in E-commerce

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining Pub Date : 2018-02-02 DOI: 10.1145/3159652.3159710

Zihan Wang, Ziheng Jiang, Z. Ren, Jiliang Tang, Dawei Yin

Abstract: In personalized recommendation, candidate generation plays an infrastructural role by retrieving candidates out of billions of items. During this process, substitutes and complements constitute two main classes of retrieved candidates: substitutable products are interchangeable, whereas complementary products might be purchased together by users. Discriminating substitutable and complementary products plays an increasingly important role in e-commerce portals because it affects the performance of candidate generation. For example, when a user has browsed a t-shirt, it is reasonable to retrieve similar t-shirts, i.e., substitutes; whereas if the user has already purchased one, it would be better to retrieve trousers, hats or shoes, as complements of t-shirts. In this paper, we propose a path-constrained framework (PMSC) for discriminating substitutes and complements. Specifically, for each product, we first learn its embedding representations in a general semantic space. Thereafter, we project the embedding vectors into two separate spaces via a novel mapping function. In the end, we incorporate each embedding with path constraints to further boost the discriminative ability of the model. Extensive experiments conducted on two e-commerce datasets show the effectiveness of our proposed method.

Citations: 79