Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining: Latest Papers

Cognitive Biases in Crowdsourcing
Carsten Eickhoff
{"title":"Cognitive Biases in Crowdsourcing","authors":"Carsten Eickhoff","doi":"10.1145/3159652.3159654","DOIUrl":"https://doi.org/10.1145/3159652.3159654","url":null,"abstract":"Crowdsourcing has become a popular paradigm in data curation, annotation and evaluation for many artificial intelligence and information retrieval applications. Considerable efforts have gone into devising effective quality control mechanisms that identify or discourage cheat submissions in an attempt to improve the quality of noisy crowd judgments. Besides purposeful cheating, there is another source of noise that is often alluded to but insufficiently studied: Cognitive biases. This paper investigates the prevalence and effect size of a range of common cognitive biases on a standard relevance judgment task. Our experiments are based on three sizable publicly available document collections and note significant detrimental effects on annotation quality, system ranking and the performance of derived rankers when task design does not account for such biases.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125471696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 112
Performance Analysis of a Privacy Constrained kNN Recommendation Using Data Sketches
A. Afsharinejad, N. Hurley
{"title":"Performance Analysis of a Privacy Constrained kNN Recommendation Using Data Sketches","authors":"A. Afsharinejad, N. Hurley","doi":"10.1145/3159652.3159673","DOIUrl":"https://doi.org/10.1145/3159652.3159673","url":null,"abstract":"This paper evaluates two algorithms, BLIP and JLT, for creating differentially private data sketches of user profiles, in terms of their ability to protect a kNN collaborative filtering algorithm from an inference attack by third parties. The transformed user profiles are employed in a user-based top-N collaborative filtering system. For the first time, a theoretical analysis of the BLIP is carried out, to derive expressions that relate its parameters to its performance. This allows the two techniques to be fairly compared. The impact of deploying these approaches on the utility of the system (its ability to make good recommendations) and on its privacy level (the ability of third parties to make inferences about the underlying user preferences) is examined. An active inference attack is evaluated that consists of injecting a number of tailored sybil profiles into the system database. User profile data of targeted users is then inferred from the recommendations made to the sybils. Although the differentially private sketches are designed to allow the transformed user profiles to be published without compromising privacy, the attack we examine does not use such information and depends only on some pre-existing knowledge of some user preferences as well as the neighbourhood size of the kNN algorithm. Our analysis therefore assesses in practical terms a relatively weak privacy attack, which is extremely simple to apply in systems that allow low-cost generation of sybils. 
We find that, for a given differential privacy level, the BLIP injects less noise into the system, but for a given level of noise, the JLT offers a more compact representation.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115530078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Mining Twitter for Fine-Grained Political Opinion Polarity Classification, Ideology Detection and Sarcasm Detection
Sandeepa Kannangara
{"title":"Mining Twitter for Fine-Grained Political Opinion Polarity Classification, Ideology Detection and Sarcasm Detection","authors":"Sandeepa Kannangara","doi":"10.1145/3159652.3170461","DOIUrl":"https://doi.org/10.1145/3159652.3170461","url":null,"abstract":"In this paper, we propose three models for socio-political opinion polarity classification of microblog posts. Firstly, a novel probabilistic model, the Joint-Entity-Sentiment-Topic (JEST) model, which captures opinions as a combination of the target entity, sentiment and topic, will be proposed. Secondly, a model for ideology detection called JEST-Ideology will be proposed to identify an individual's orientation towards topics/issues and target entities by extending the proposed opinion polarity classification framework. Finally, we propose a novel method to accurately detect sarcastic opinions by utilizing detected fine-grained opinion and ideology.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122376597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
Neural Graph Learning: Training Neural Networks Using Graphs
T. Bui, Sujith Ravi, Vivek Ramavajjala
{"title":"Neural Graph Learning: Training Neural Networks Using Graphs","authors":"T. Bui, Sujith Ravi, Vivek Ramavajjala","doi":"10.1145/3159652.3159731","DOIUrl":"https://doi.org/10.1145/3159652.3159731","url":null,"abstract":"Label propagation is a powerful and flexible semi-supervised learning technique on graphs. Neural networks, on the other hand, have proven track records in many supervised learning tasks. In this work, we propose a training framework with a graph-regularised objective, namely Neural Graph Machines, that can combine the power of neural networks and label propagation. This work generalises previous literature on graph-augmented training of neural networks, enabling it to be applied to multiple neural architectures (Feed-forward NNs, CNNs and LSTM RNNs) and a wide range of graphs. The new objective allows the neural networks to harness both labeled and unlabeled data by: (a) allowing the network to train using labeled data as in the supervised setting, (b) biasing the network to learn similar hidden representations for neighboring nodes on a graph, in the same vein as label propagation. Such architectures with the proposed objective can be trained efficiently using stochastic gradient descent and scaled to large graphs, with a runtime that is linear in the number of edges. 
The proposed joint training approach convincingly outperforms many existing methods on a wide range of tasks (multi-label classification on social graphs, news categorization, document classification and semantic intent classification), with multiple forms of graph inputs (including graphs with and without node-level features) and using different types of neural networks.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130616871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 71
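The graph-regularised objective described in the Neural Graph Learning abstract can be illustrated with a small sketch. The abstract does not give the formula, so everything below is an assumption made for illustration: the function names, the squared Euclidean distance between hidden representations, and the weighting factor `alpha` are hypothetical and are not taken from the paper. The sketch only shows the general shape of such an objective, a supervised term over labeled nodes plus a penalty summed once per edge (hence a cost linear in the number of edges).

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(probs[label])


def squared_distance(a, b):
    """Squared Euclidean distance between two hidden representations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def graph_regularised_loss(probs, labels, hidden, edges, alpha=0.1):
    """Supervised loss on labeled nodes plus an edge-wise smoothness penalty.

    probs:  node -> predicted class probabilities (needed for labeled nodes only)
    labels: node -> true class index (labeled nodes only)
    hidden: node -> hidden representation (labeled and unlabeled nodes)
    edges:  iterable of (u, v, weight) tuples
    alpha:  assumed trade-off between the two terms (hypothetical parameter)
    """
    supervised = sum(cross_entropy(probs[n], y) for n, y in labels.items())
    # One term per edge, so this part costs time linear in the number of edges.
    smoothness = sum(w * squared_distance(hidden[u], hidden[v]) for u, v, w in edges)
    return supervised + alpha * smoothness


# Toy example: node 0 is labeled, node 1 is unlabeled but connected to node 0.
toy_loss = graph_regularised_loss(
    probs={0: [0.5, 0.5]},
    labels={0: 0},
    hidden={0: [0.0, 0.0], 1: [1.0, 1.0]},
    edges=[(0, 1, 2.0)],
    alpha=0.5,
)
```

In a real training loop the hidden representations and probabilities would come from the network's forward pass and the loss would be minimised with stochastic gradient descent; the plain-Python version here is only meant to make the two terms explicit.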
Tutorial on Metrics of User Engagement: Applications to News, Search and E-Commerce
M. Lalmas, Liangjie Hong
{"title":"Tutorial on Metrics of User Engagement: Applications to News, Search and E-Commerce","authors":"M. Lalmas, Liangjie Hong","doi":"10.1145/3159652.3162010","DOIUrl":"https://doi.org/10.1145/3159652.3162010","url":null,"abstract":"User engagement plays a central role in companies operating online services, such as search engines, news portals, e-commerce sites, and social networks. A main challenge is to leverage collected knowledge about the daily online behavior of millions of users to understand what engages them in the short term and, more importantly, in the long term. The most common way that engagement is measured is through various online metrics, acting as proxy measures of user engagement. This tutorial will review these metrics, their advantages and drawbacks, and their appropriateness to various types of online services. As case studies, we will focus on three types of services: news, search and e-commerce. We will also briefly discuss how to develop better machine learning models to optimize online metrics, and design experiments to test these models.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114838965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction
Mingming Sun, Xu Li, Xin Wang, M. Fan, Yue Feng, Ping Li
{"title":"Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction","authors":"Mingming Sun, Xu Li, Xin Wang, M. Fan, Yue Feng, Ping Li","doi":"10.1145/3159652.3159712","DOIUrl":"https://doi.org/10.1145/3159652.3159712","url":null,"abstract":"In this paper, we consider the problem of open information extraction (OIE) for extracting entity and relation level intermediate structures from sentences in the open domain. We focus on four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and propose a unified knowledge expression form, SAOKE, to express them. We publicly release a data set which contains 48,248 sentences and the corresponding facts in the SAOKE format labeled by crowdsourcing. To our knowledge, this is the largest publicly available human labeled data set for open information extraction tasks. Using this labeled SAOKE data set, we train an end-to-end neural model using the sequence-to-sequence paradigm, called Logician, to transform sentences into facts. For each sentence, unlike existing algorithms, which generally focus on extracting each single fact without considering other possible facts, Logician performs a global optimization over all possible involved facts, in which facts not only compete with each other to attract the attention of words, but also cooperate to share words. An experimental study on various types of open domain relation extraction tasks reveals the consistent superiority of Logician to other state-of-the-art algorithms. 
The experiments verify the reasonableness of SAOKE format, the valuableness of SAOKE data set, the effectiveness of the proposed Logician model, and the feasibility of the methodology to apply end-to-end learning paradigm on supervised data sets for the challenging tasks of open information extraction.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124037216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 51
A Unified Processing Paradigm for Interactive Location-based Web Search
Sheng Wang, Z. Bao, Shixun Huang, Rui Zhang
{"title":"A Unified Processing Paradigm for Interactive Location-based Web Search","authors":"Sheng Wang, Z. Bao, Shixun Huang, Rui Zhang","doi":"10.1145/3159652.3159667","DOIUrl":"https://doi.org/10.1145/3159652.3159667","url":null,"abstract":"This paper studies the location-based web search and aims to build a unified processing paradigm for two purposes: (1) efficiently support each of the various types of location-based queries (kNN query, top-k spatial-textual query, etc.) on two major forms of geo-tagged data, i.e., spatial point data such as geo-tagged web documents, and spatial trajectory data such as a sequence of geo-tagged travel blogs by a user; (2) support interactive search to provide quick response for a query session, within which a user usually keeps refining her query by either issuing different query types or specifying different constraints (e.g., adding a keyword and/or location, changing the choice of k, etc.) until she finds the desired results. To achieve this goal, we first propose a general Top-k query called Monotone Aggregate Spatial Keyword query-MASK, which is able to cover most types of location-based web search. Next, we develop a unified indexing (called Textual-Grid-Point Inverted Index) and query processing paradigm (called ETAIL Algorithm) to answer a single MASK query efficiently. Furthermore, we extend ETAIL to provide interactive search for multiple queries within one query session, by exploiting the commonality of textual and/or spatial dimension among queries. 
Last, extensive experiments on four real datasets verify the robustness and efficiency of our approach.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124144196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Identifying Informational vs. Conversational Questions on Community Question Answering Archives
Ido Guy, Victor Makarenkov, Niva Hazon, L. Rokach, Bracha Shapira
{"title":"Identifying Informational vs. Conversational Questions on Community Question Answering Archives","authors":"Ido Guy, Victor Makarenkov, Niva Hazon, L. Rokach, Bracha Shapira","doi":"10.1145/3159652.3159733","DOIUrl":"https://doi.org/10.1145/3159652.3159733","url":null,"abstract":"Questions on community question answering websites usually reflect one of two intents: learning information or starting a conversation. In this paper, we revisit this fundamental classification task of informational versus conversational questions, which was originally introduced and studied in 2009. We use a substantially larger dataset of archived questions from Yahoo Answers, which includes the question's title, description, answers, and votes. We replicate the original experiments over this dataset, point out the commonalities with and differences from the original results, and present a broad set of characteristics that distinguish the two question types. We also develop new classifiers that make use of additional data types, advanced machine learning, and a large dataset of unlabeled data, which achieve enhanced performance.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127877990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Recommendation in Heterogeneous Information Networks Based on Generalized Random Walk Model and Bayesian Personalized Ranking
Zhengshen Jiang, Hongzhi Liu, Bin Fu, Zhonghai Wu, Zhang Tao
{"title":"Recommendation in Heterogeneous Information Networks Based on Generalized Random Walk Model and Bayesian Personalized Ranking","authors":"Zhengshen Jiang, Hongzhi Liu, Bin Fu, Zhonghai Wu, Zhang Tao","doi":"10.1145/3159652.3159715","DOIUrl":"https://doi.org/10.1145/3159652.3159715","url":null,"abstract":"Recommendation based on heterogeneous information networks (HINs) is attracting more and more attention due to its ability to emulate collaborative filtering, content-based filtering, context-aware recommendation and combinations of any of these recommendation semantics. Random walk based methods are usually used to mine the paths, weigh the paths, and compute the closeness or relevance between two nodes in a HIN. A key to the success of these methods is how to properly set the weights of links in a HIN. In existing methods, the weights of links are mostly set heuristically. In this paper, we propose a Bayesian Personalized Ranking (BPR) based machine learning method, called HeteLearn, to learn the weights of links in a HIN. In order to model user preferences for personalized recommendation, we also propose a generalized random walk with restart model on HINs. We evaluate the proposed method in a personalized recommendation task and a tag recommendation task. 
Experimental results show that our method performs significantly better than both the traditional collaborative filtering and the state-of-the-art HIN-based recommendation methods.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132484815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 65
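The random-walk-with-restart idea mentioned in the HeteLearn abstract can be sketched on a toy weighted graph. This is a plain, textbook-style random walk with restart and not the generalized model or the BPR-based weight learning from the paper: the link weights are simply given as input here, and the function name, the `restart` probability, and the example graph are all hypothetical. The sketch also assumes that every node has at least one outgoing link and that every link target appears as a key in `weights`.

```python
def random_walk_with_restart(weights, start, restart=0.15, iterations=100):
    """Approximate the stationary visiting probabilities of a walk that
    follows weighted links and jumps back to `start` with probability `restart`.

    weights: node -> {neighbour: link weight}; assumed to contain every node
             as a key, each with at least one outgoing link.
    """
    nodes = list(weights)
    # Row-normalise outgoing link weights into transition probabilities.
    transition = {}
    for u in nodes:
        total = sum(weights[u].values())
        transition[u] = {v: w / total for v, w in weights[u].items()}

    scores = {n: 1.0 if n == start else 0.0 for n in nodes}
    for _ in range(iterations):
        updated = {n: 0.0 for n in nodes}
        for u in nodes:
            for v, p in transition[u].items():
                updated[v] += (1.0 - restart) * scores[u] * p
        updated[start] += restart  # restart mass always returns to the start node
        scores = updated
    return scores


# Toy graph: a user linked to two items, with a heavier link to item_a.
toy_weights = {
    "user": {"item_a": 3.0, "item_b": 1.0},
    "item_a": {"user": 1.0},
    "item_b": {"user": 1.0},
}
toy_scores = random_walk_with_restart(toy_weights, start="user")
```

Ranking the non-start nodes by their score gives a simple relevance ordering with respect to the start node. In this toy graph the more heavily weighted link makes `item_a` score higher than `item_b`, which is why the choice of link weights matters for this family of methods.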
A Path-constrained Framework for Discriminating Substitutable and Complementary Products in E-commerce
Zihan Wang, Ziheng Jiang, Z. Ren, Jiliang Tang, Dawei Yin
{"title":"A Path-constrained Framework for Discriminating Substitutable and Complementary Products in E-commerce","authors":"Zihan Wang, Ziheng Jiang, Z. Ren, Jiliang Tang, Dawei Yin","doi":"10.1145/3159652.3159710","DOIUrl":"https://doi.org/10.1145/3159652.3159710","url":null,"abstract":"In personalized recommendation, candidate generation plays an infrastructural role by retrieving candidates out of billions of items. During this process, substitutes and complements constitute two main classes of retrieved candidates: substitutable products are interchangeable, whereas complementary products might be purchased together by users. Discriminating substitutable and complementary products is playing an increasingly important role in e-commerce portals by affecting the performance of candidate generation, e.g., when a user has browsed a t-shirt, it is reasonable to retrieve similar t-shirts, i.e., substitutes; whereas if the user has already purchased one, it would be better to retrieve trousers, hats or shoes, as complements of t-shirts. In this paper, we propose a path-constrained framework (PMSC) for discriminating substitutes and complements. Specifically, for each product, we first learn its embedding representations in a general semantic space. Thereafter, we project the embedding vectors into two separate spaces via a novel mapping function. In the end, we incorporate each embedding with path-constraints to further boost the discriminative ability of the model. 
Extensive experiments conducted on two e-commerce datasets show the effectiveness of our proposed method.","PeriodicalId":401247,"journal":{"name":"Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131562153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 79