Proceedings of the 25th ACM International on Conference on Information and Knowledge Management: Latest Articles

On Transductive Classification in Heterogeneous Information Networks
Xiang Li, B. Kao, Yudian Zheng, Zhipeng Huang
DOI: https://doi.org/10.1145/2983323.2983730
Published: 2016-10-24
Abstract: A heterogeneous information network (HIN) is used to model objects of different types and their relationships. Objects are often associated with properties such as labels. In many applications, such as curated knowledge bases for which object labels are manually given, only a small fraction of the objects are labeled. Studies have shown that transductive classification is an effective way to classify and to deduce labels of objects, and a number of transductive classifiers have been put forward to classify objects in an HIN. We study the performance of a few representative transductive classification algorithms on HINs. We identify two fundamental properties, namely, cohesiveness and connectedness, of an HIN that greatly influence the effectiveness of transductive classifiers. We define metrics that measure the two properties. Through experiments, we show that the two properties serve as very effective indicators that predict the accuracy of transductive classifiers. Based on cohesiveness and connectedness, we derive (1) a black-box tester that evaluates whether transductive classifiers should be applied for a given classification task and (2) an active learning algorithm that identifies the objects in an HIN whose labels should be sought in order to improve classification accuracy.
Citations: 19
Hashtag Recommendation for Enterprise Applications
D. Mahajan, Vishwajit Kolathur, Chetan Bansal, Suresh Parthasarathy, Sundararajan Sellamanickam, S. Keerthi, J. Gehrke
DOI: https://doi.org/10.1145/2983323.2983365
Published: 2016-10-24
Abstract: Hashtags have been popularly used in several social cum consumer network settings such as Twitter and Facebook. In this paper, we consider the problem of recommending hashtags for enterprise applications. These applications include emails (e.g., Outlook), enterprise social networks (e.g., Yammer) and special interest group email lists. This problem arises in an organization setting and hashtags are enterprise domain specific. One important aspect of our recommendation system is that we recommend hashtags for the Inline hashtag scenario, where recommendations change as the user inserts hashtags while typing the message. This involves working with partial content information. Besides this, we consider the conventional Post hashtagging scenario, where hashtags are recommended for the full message. We also consider an important (sub)scenario, viz., Auto-complete, where hashtags are recommended with user-provided partial information such as a sub-string present in the hashtag. Auto-complete can be used with both Inline and Post scenarios. To the best of our knowledge, Inline, Auto-complete hashtag recommendations and hashtagging in enterprise applications have not been studied before. We propose to learn a joint model that uses features of three types, namely, temporal, structural and content. Our learning formulation handles all the hashtagging scenarios naturally. A comprehensive experimental study on five datasets of user email accounts collected by running an Outlook plugin (a key requirement for large scale industrial deployment), one dataset of a special interest group email list and one enterprise social network data set shows that the proposed method performs significantly better than the state-of-the-art methods used in consumer applications such as Twitter. The primary reason is that different feature types play a dominant role in different scenarios and datasets. Since the joint model makes use of all feature types effectively, it performs better in almost all scenarios and datasets.
Citations: 3
Finding News Citations for Wikipedia
B. Fetahu, K. Markert, W. Nejdl, Avishek Anand
DOI: https://doi.org/10.1145/2983323.2983808
Published: 2016-10-24
Abstract: An important editing policy in Wikipedia is to provide citations for added statements in Wikipedia pages, where statements can be arbitrary pieces of text, ranging from a sentence to a paragraph. In many cases citations are either outdated or missing altogether. In this work we address the problem of finding and updating news citations for statements in entity pages. We propose a two-stage supervised approach for this problem. In the first step, we construct a classifier to find out whether statements need a news citation or other kinds of citations (web, book, journal, etc.). In the second step, we develop a news citation algorithm for Wikipedia statements, which recommends appropriate citations from a given news collection. Apart from IR techniques that use the statement to query the news collection, we also formalize three properties of an appropriate citation, namely: (i) the citation should entail the Wikipedia statement, (ii) the statement should be central to the citation, and (iii) the citation should be from an authoritative source. We perform an extensive evaluation of both steps, using 20 million articles from a real-world news collection. Our results are quite promising, and show that we can perform this task with high precision and at scale.
Citations: 29
Online Food Recipe Title Semantics: Combining Nutrient Facts and Topics
T. Kusmierczyk, K. Nørvåg
DOI: https://doi.org/10.1145/2983323.2983897
Published: 2016-10-24
Abstract: Dietary pattern analysis is an important research area, and recently the availability of rich resources in food-focused social networks has enabled new opportunities in that field. However, there is little understanding of how online textual content is related to actual health factors, e.g., nutritional values. To address this lack of knowledge, we present a novel approach to mine and model online food content by combining text topics with related nutrient facts. Our empirical analysis reveals a strong correlation between them, and our experiments show the extent to which it is possible to predict nutrient facts from a meal name.
Citations: 17
Large-scale Robust Online Matching and Its Application in E-commerce
Rong Jin
DOI: https://doi.org/10.1145/2983323.2983370
Published: 2016-10-24
Abstract: This talk focuses on the large-scale matching problem, which aims to find the optimal assignment of tasks to different agents under linear constraints. Large-scale matching has found numerous applications in e-commerce. A well-known example is budget-aware online advertisement. A common practice in online advertisement is to find, for each opportunity or user, the advertisements that best fit his or her interests. The main shortcoming of this greedy approach is that it does not take into account the budget limits set by advertisers. Our studies, as well as others, have shown that by carefully taking into account the budget limits of individual advertisers, we can significantly improve the performance of the advertisement system. Despite the rich literature, two important issues are often overlooked in previous studies of the matching/assignment problem. The first issue arises from the fact that most quantities used in the optimization are estimated from historical data and are therefore likely to be inaccurate and unreliable. The second challenge is how to perform online matching: in many e-commerce problems, tasks are created in an online fashion, and the algorithm has to make an assignment decision immediately when each task emerges. We refer to these two issues as the challenges of "robust matching" and "online matching". To address the first challenge, I will introduce two different techniques for robust matching. The first approach is based on the theory of robust optimization, which takes into account the uncertainties of estimated quantities when performing optimization. The second approach is based on the theory of two-sided matching, whose result depends only on the partial preference of estimated quantities. To deal with the challenge of online matching, I will discuss two online optimization techniques, one based on the theory of primal-dual online optimization and one based on minimizing dynamic regret under long-term constraints. We verify the effectiveness of all these approaches by applying them to real-world projects developed at Alibaba.
Citations: 0
Hashtag Recommendation Based on Topic Enhanced Embedding, Tweet Entity Data and Learning to Rank
Quanzhi Li, Sameena Shah, Armineh Nourbakhsh, Xiaomo Liu, Rui Fang
DOI: https://doi.org/10.1145/2983323.2983915
Published: 2016-10-24
Abstract: In this paper, we present a new approach to recommending hashtags for tweets. It uses a learning-to-rank algorithm to incorporate features built from topic enhanced word embeddings, tweet entity data, hashtag frequency, hashtag temporal data and tweet URL domain information. The experiments using millions of tweets and hashtags show that the proposed approach outperforms the three baseline methods: the LDA topic, the tf.idf based and the general word embedding approaches.
Citations: 26
Regularizing Structured Classifier with Conditional Probabilistic Constraints for Semi-supervised Learning
V. Zheng, K. Chang
DOI: https://doi.org/10.1145/2983323.2983860
Published: 2016-10-24
Abstract: Constraints have been shown to be an effective way to incorporate unlabeled data for semi-supervised structured classification. We recognize that constraints are often conditional and probabilistic; moreover, a constraint can have its condition depend on either just observations (which we call an x-type constraint) or even hidden variables (which we call a y-type constraint). We wish to design a constraint formulation that can flexibly model the constraint probability for both x-type and y-type constraints, and later use it to regularize general structured classifiers for semi-supervision. Surprisingly, none of the existing models have such a constraint formulation. Thus in this paper, we propose a new conditional probabilistic formulation for modeling both x-type and y-type constraints. We also recognize the inference complication for y-type constraints, and propose a systematic selective evaluation approach to efficiently realize the constraints. Finally, we evaluate our model in three applications, including named entity recognition, part-of-speech tagging and entity information extraction, with nine data sets in total. We show that our model is generally more accurate and efficient than the state-of-the-art baselines. Our code and data are available at https://bitbucket.org/vwz/cikm2016-cpf/.
Citations: 3
Personalized Search: Potential and Pitfalls
S. Dumais
DOI: https://doi.org/10.1145/2983323.2983367
Published: 2016-10-24
Abstract: Traditionally, search engines have returned the same results to everyone who asks the same question. However, using a single ranking for everyone in every context at every point in time limits how well a search engine can do in providing relevant information. In this talk I present a framework to quantify the "potential for personalization" which we use to characterize the extent to which different people have different intents for the same query. I describe several examples of how we represent and use different kinds of contextual features to improve search quality for individuals and groups. Finally, I conclude by highlighting important challenges in developing personalized systems at Web scale including privacy, transparency, serendipity, and evaluation.
Citations: 14
Attractiveness versus Competition: Towards an Unified Model for User Visitation
Thanh-Nam Doan, Ee-Peng Lim
DOI: https://doi.org/10.1145/2983323.2983657
Published: 2016-10-24
Abstract: Modeling user check-in behavior provides useful insights about venues as well as the users visiting them. These insights can be used in urban planning and recommender system applications. Unlike previous works that focus on modeling the effect of distance on a user's choice of check-in venues, this paper studies check-in behaviors affected by two venue-related factors, namely, area attractiveness and neighborhood competitiveness. The former refers to the ability of an area with multiple venues to collectively attract check-ins from users, while the latter represents the ability of a venue to compete with its neighbors in the same area for check-ins. We first embark on a data science study to ascertain the two factors using two Foursquare datasets gathered from users and venues in Singapore and Jakarta, two major cities in Asia. We then propose the VAN model incorporating user-venue distance, area attractiveness and neighborhood competitiveness factors. The results from real datasets show that the VAN model outperforms the various baselines in two tasks: home location prediction and check-in prediction.
Citations: 7
Thymeflow, A Personal Knowledge Base with Spatio-temporal Data
David Montoya, Thomas Pellissier Tanon, S. Abiteboul, Fabian M. Suchanek
DOI: https://doi.org/10.1145/2983323.2983337
Published: 2016-10-24
Abstract: The typical Internet user has data spread over several devices and across several online systems. We demonstrate an open-source system for integrating a user's data from different sources into a single Knowledge Base. Our system integrates data of different kinds into a coherent whole, starting with email messages, calendar, contacts, and location history. It is able to detect event periods in the user's location data and align them with calendar events. We will demonstrate how to query the system within and across different dimensions, and perform analytics over emails, events, and locations.
Citations: 5