Proceedings of the 25th ACM International on Conference on Information and Knowledge Management: Latest Articles, Page 10

On Transductive Classification in Heterogeneous Information Networks

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983730

Xiang Li, B. Kao, Yudian Zheng, Zhipeng Huang

Abstract: A heterogeneous information network (HIN) is used to model objects of different types and their relationships. Objects are often associated with properties such as labels. In many applications, such as curated knowledge bases for which object labels are manually given, only a small fraction of the objects are labeled. Studies have shown that transductive classification is an effective way to classify and to deduce labels of objects, and a number of transductive classifiers have been put forward to classify objects in an HIN. We study the performance of a few representative transductive classification algorithms on HINs. We identify two fundamental properties, namely, cohesiveness and connectedness, of an HIN that greatly influence the effectiveness of transductive classifiers. We define metrics that measure the two properties. Through experiments, we show that the two properties serve as very effective indicators that predict the accuracy of transductive classifiers. Based on cohesiveness and connectedness we derive (1) a black-box tester that evaluates whether transductive classifiers should be applied for a given classification task and (2) an active learning algorithm that identifies the objects in an HIN whose labels should be sought in order to improve classification accuracy.

Citations: 19

Hashtag Recommendation for Enterprise Applications

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983365

D. Mahajan, Vishwajit Kolathur, Chetan Bansal, Suresh Parthasarathy, Sundararajan Sellamanickam, S. Keerthi, J. Gehrke

Abstract: Hashtags are widely used in social and consumer network settings such as Twitter and Facebook. In this paper, we consider the problem of recommending hashtags for enterprise applications. These applications include emails (e.g., Outlook), enterprise social networks (e.g., Yammer) and special interest group email lists. This problem arises in an organization setting, and hashtags are enterprise domain specific. One important aspect of our recommendation system is that we recommend hashtags for the Inline hashtag scenario, where recommendations change as the user inserts hashtags while typing the message. This involves working with partial content information. Besides this, we consider the conventional Post hashtagging scenario, where hashtags are recommended for the full message. We also consider an important (sub)scenario, viz., Auto-complete, where hashtags are recommended from user-provided partial information such as a substring of the hashtag. Auto-complete can be used with both the Inline and Post scenarios. To the best of our knowledge, Inline and Auto-complete hashtag recommendations and hashtagging in enterprise applications have not been studied before. We propose to learn a joint model that uses features of three types, namely, temporal, structural and content. Our learning formulation handles all the hashtagging scenarios naturally. A comprehensive experimental study on five datasets of user email accounts collected by running an Outlook plugin (a key requirement for large-scale industrial deployment), one dataset of a special interest group email list and one enterprise social network dataset shows that the proposed method performs significantly better than the state-of-the-art methods used in consumer applications such as Twitter. The primary reason is that different feature types play a dominant role in different scenarios and datasets. Since the joint model makes use of all feature types effectively, it performs better in almost all scenarios and datasets.

Citations: 3

Finding News Citations for Wikipedia

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983808

B. Fetahu, K. Markert, W. Nejdl, Avishek Anand

Abstract: An important editing policy in Wikipedia is to provide citations for added statements in Wikipedia pages, where statements can be arbitrary pieces of text, ranging from a sentence to a paragraph. In many cases citations are either outdated or missing altogether. In this work we address the problem of finding and updating news citations for statements in entity pages. We propose a two-stage supervised approach for this problem. In the first step, we construct a classifier to find out whether statements need a news citation or other kinds of citations (web, book, journal, etc.). In the second step, we develop a news citation algorithm for Wikipedia statements, which recommends appropriate citations from a given news collection. Apart from IR techniques that use the statement to query the news collection, we also formalize three properties of an appropriate citation, namely: (i) the citation should entail the Wikipedia statement, (ii) the statement should be central to the citation, and (iii) the citation should be from an authoritative source. We perform an extensive evaluation of both steps, using 20 million articles from a real-world news collection. Our results are quite promising, and show that we can perform this task with high precision and at scale.

Citations: 29

Online Food Recipe Title Semantics: Combining Nutrient Facts and Topics

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983897

T. Kusmierczyk, K. Nørvåg

Citations: 17

Large-scale Robust Online Matching and Its Application in E-commerce

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983370

Rong Jin

Abstract: This talk focuses on the large-scale matching problem, which aims to find the optimal assignment of tasks to different agents under linear constraints. Large-scale matching has found numerous applications in e-commerce. A well-known example is budget-aware online advertisement. A common practice in online advertisement is to find, for each opportunity or user, the advertisements that best fit his/her interests. The main shortcoming of this greedy approach is that it does not take into account the budget limits set by advertisers. Our studies, as well as others, have shown that by carefully taking into account the budget limits of individual advertisers, we could significantly improve the performance of the advertisement system. Despite a rich literature, two important issues are often overlooked in previous studies of the matching/assignment problem. The first issue arises from the fact that most quantities used by optimization are estimated based on historical data and are therefore likely to be inaccurate and unreliable. The second challenge is how to perform online matching: in many e-commerce problems, tasks are created in an online fashion, and the algorithm has to make an assignment decision immediately as each task arrives. We refer to these two issues as the challenges of "robust matching" and "online matching". To address the first challenge, I will introduce two different techniques for robust matching. The first approach is based on the theory of robust optimization, which takes into account the uncertainties of estimated quantities when performing optimization. The second approach is based on the theory of two-sided matching, whose result depends only on the partial preference of estimated quantities. To deal with the challenge of online matching, I will discuss two online optimization techniques, one based on the theory of primal-dual online optimization and one based on minimizing dynamic regret under long-term constraints. We verify the effectiveness of all these approaches by applying them to real-world projects developed at Alibaba.

Citations: 0
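
The contrast drawn in the entry above, between a greedy choice that ignores advertiser budgets and a budget-aware one, can be sketched on a toy instance. The abstract does not specify the talk's algorithms, so the budget-aware branch below swaps in a standard bid-scaling rule from the online ad allocation literature (each bid is discounted by 1 - exp(f - 1), where f is the fraction of the advertiser's budget already spent). The function name `assign_online`, the data layout, and all numbers are assumptions made for illustration.

```python
import math


def assign_online(requests, budgets, budget_aware=True):
    """Assign each arriving request to at most one advertiser.

    requests: list of {advertiser: bid} dicts, processed in arrival order.
    budgets:  {advertiser: total budget}.
    Returns the total revenue actually collected.
    """
    spent = {adv: 0.0 for adv in budgets}
    revenue = 0.0
    for bids in requests:
        best, best_score = None, 0.0
        for adv, bid in bids.items():
            remaining = budgets[adv] - spent[adv]
            if budget_aware:
                if remaining < bid:
                    continue  # advertiser cannot pay for this request
                fraction_spent = spent[adv] / budgets[adv]
                # Discount bids of advertisers whose budget is nearly used up.
                score = bid * (1.0 - math.exp(fraction_spent - 1.0))
            else:
                score = bid  # budget-oblivious greedy: highest bid wins
            if score > best_score:
                best, best_score = adv, score
        if best is None:
            continue
        # An exhausted advertiser pays nothing, so the request is wasted.
        charge = min(bids[best], budgets[best] - spent[best])
        spent[best] += charge
        revenue += charge
    return revenue


# Hypothetical instance: A slightly outbids B early, but only A wants later traffic.
toy_budgets = {"A": 2.0, "B": 2.0}
toy_requests = [{"A": 1.0, "B": 0.9}, {"A": 1.0, "B": 0.9}, {"A": 1.0}, {"A": 1.0}]
oblivious_revenue = assign_online(toy_requests, toy_budgets, budget_aware=False)
aware_revenue = assign_online(toy_requests, toy_budgets, budget_aware=True)
```

On this instance the budget-oblivious rule exhausts advertiser A on the first two requests and earns nothing from the last two, while the budget-aware rule shifts one early request to B and keeps part of A's budget for later traffic.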

Hashtag Recommendation Based on Topic Enhanced Embedding, Tweet Entity Data and Learning to Rank

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983915

Quanzhi Li, Sameena Shah, Armineh Nourbakhsh, Xiaomo Liu, Rui Fang

Citations: 26

Regularizing Structured Classifier with Conditional Probabilistic Constraints for Semi-supervised Learning

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983860

V. Zheng, K. Chang

Abstract: Constraints have been shown to be an effective way to incorporate unlabeled data for semi-supervised structured classification. We recognize that constraints are often conditional and probabilistic; moreover, a constraint can have its condition depend on either just observations (which we call an x-type constraint) or even hidden variables (which we call a y-type constraint). We wish to design a constraint formulation that can flexibly model the constraint probability for both x-type and y-type constraints, and later use it to regularize general structured classifiers for semi-supervision. Surprisingly, none of the existing models have such a constraint formulation. Thus in this paper, we propose a new conditional probabilistic formulation for modeling both x-type and y-type constraints. We also recognize the inference complication for y-type constraints, and propose a systematic selective evaluation approach to efficiently realize the constraints. Finally, we evaluate our model in three applications, including named entity recognition, part-of-speech tagging and entity information extraction, with nine data sets in total. We show that our model is generally more accurate and efficient than the state-of-the-art baselines. Our code and data are available at https://bitbucket.org/vwz/cikm2016-cpf/.

Citations: 3

Personalized Search: Potential and Pitfalls

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983367

S. Dumais

Citations: 14

Attractiveness versus Competition: Towards an Unified Model for User Visitation

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983657

Thanh-Nam Doan, Ee-Peng Lim

Citations: 7

Thymeflow, A Personal Knowledge Base with Spatio-temporal Data

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management Pub Date : 2016-10-24 DOI: 10.1145/2983323.2983337

David Montoya, Thomas Pellissier Tanon, S. Abiteboul, Fabian M. Suchanek

Citations: 5