Proceedings of the 25th ACM International on Conference on Information and Knowledge Management: Latest Papers

Webpage Depth-level Dwell Time Prediction
Chong Wang, Achir Kalra, C. Borcea, Yi Chen
{"title":"Webpage Depth-level Dwell Time Prediction","authors":"Chong Wang, Achir Kalra, C. Borcea, Yi Chen","doi":"10.1145/2983323.2983878","DOIUrl":"https://doi.org/10.1145/2983323.2983878","url":null,"abstract":"The amount of time spent by users at specific page depths within webpages, called dwell time, can be used by web publishers to decide where to place online ads and what type of ads to place at different depths within a webpage. This paper presents a model to predict the dwell time for a given \"user, webpage, depth\" triplet based on historic data collected by publishers. Dwell time prediction is difficult due to user behavior variability and data sparsity. We adopt the Factorization Machines model because it is able to capture the interaction between users and webpages, overcome the data sparsity issue, and provide flexibility to add auxiliary information such as the visible area of a user's browser. Experimental results using data from a large web publisher demonstrate that our model outperforms deterministic and regression-based comparison models.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126889939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Leveraging the Implicit Structure within Social Media for Emergent Rumor Detection
Justin Sampson, Fred Morstatter, Liang Wu, Huan Liu
{"title":"Leveraging the Implicit Structure within Social Media for Emergent Rumor Detection","authors":"Justin Sampson, Fred Morstatter, Liang Wu, Huan Liu","doi":"10.1145/2983323.2983697","DOIUrl":"https://doi.org/10.1145/2983323.2983697","url":null,"abstract":"The automatic and early detection of rumors is of paramount importance as the spread of information with questionable veracity can have devastating consequences. This became starkly apparent when, in early 2013, a compromised Associated Press account issued a tweet claiming that there had been an explosion at the White House. This tweet resulted in a significant drop for the Dow Jones Industrial Average. Most existing work in rumor detection leverages conversation statistics and propagation patterns, however, such patterns tend to emerge slowly requiring a conversation to have a significant number of interactions in order to become eligible for classification. In this work, we propose a method for classifying conversations within their formative stages as well as improving accuracy within mature conversations through the discovery of implicit linkages between conversation fragments. In our experiments, we show that current state-of-the-art rumor classification methods can leverage implicit links to significantly improve the ability to properly classify emergent conversations when very little conversation data is available. Adopting this technique allows rumor detection methods to continue to provide a high degree of classification accuracy on emergent conversations with as few as a single tweet. 
This improvement virtually eliminates the delay of conversation growth inherent in current rumor classification methods while significantly increasing the number of conversations considered viable for classification.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123332507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 85
A Study of Realtime Summarization Metrics
Matthew Ekstrand-Abueg, R. McCreadie, Virgil Pavlu, Fernando Diaz
{"title":"A Study of Realtime Summarization Metrics","authors":"Matthew Ekstrand-Abueg, R. McCreadie, Virgil Pavlu, Fernando Diaz","doi":"10.1145/2983323.2983653","DOIUrl":"https://doi.org/10.1145/2983323.2983653","url":null,"abstract":"Unexpected news events, such as natural disasters or other human tragedies, create a large volume of dynamic text data from official news media as well as less formal social media. Automatic real-time text summarization has become an important tool for quickly transforming this overabundance of text into clear, useful information for end-users including affected individuals, crisis responders, and interested third parties. Despite the importance of real-time summarization systems, their evaluation is not well understood as classic methods for text summarization are inappropriate for real-time and streaming conditions. The TREC 2013-2015 Temporal Summarization (TREC-TS) track was one of the first evaluation campaigns to tackle the challenges of real-time summarization evaluation, introducing new metrics, ground-truth generation methodology and dataset. 
In this paper, we present a study of TREC-TS track evaluation methodology, with the aim of documenting its design, analyzing its effectiveness, as well as identifying improvements and best practices for the evaluation of temporal summarization systems.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121449088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Active Content-Based Crowdsourcing Task Selection
P. Bansal, Carsten Eickhoff, Thomas Hofmann
{"title":"Active Content-Based Crowdsourcing Task Selection","authors":"P. Bansal, Carsten Eickhoff, Thomas Hofmann","doi":"10.1145/2983323.2983716","DOIUrl":"https://doi.org/10.1145/2983323.2983716","url":null,"abstract":"Crowdsourcing has long established itself as a viable alternative to corpus annotation by domain experts for tasks such as document relevance assessment. The crowdsourcing process traditionally relies on high degrees of label redundancy in order to mitigate the detrimental effects of individually noisy worker submissions. Such redundancy comes at the cost of increased label volume, and, subsequently, monetary requirements. In practice, especially as the size of datasets increases, this is undesirable. In this paper, we focus on an alternate method that exploits document information instead, to infer relevance labels for unjudged documents. We present an active learning scheme for document selection that aims at maximising the overall relevance label prediction accuracy, for a given budget of available relevance judgements by exploiting system-wide estimates of label variance and mutual information. Our experiments are based on TREC 2011 Crowdsourcing Track data and show that our method is able to achieve state-of-the-art performance while requiring 17% - 25% less budget.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122273705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
CyberSafety 2016: The First International Workshop on Computational Methods in CyberSafety
Shivakant Mishra, Q. Lv, Richard O. Han, Jeremy Blackburn
{"title":"CyberSafety 2016: The First International Workshop on Computational Methods in CyberSafety","authors":"Shivakant Mishra, Q. Lv, Richard O. Han, Jeremy Blackburn","doi":"10.1145/2983323.2988541","DOIUrl":"https://doi.org/10.1145/2983323.2988541","url":null,"abstract":"The theme of cybersafety is an important emerging research topic on the Internet that manifests itself daily as users navigate the Web and networked applications. Examples of cybersafety issues include cyberbullying, cyberthreats, recruiting minors via Internet services for nefarious purposes, using deceptive means to dupe vulnerable populations, exhibiting misbehaving behaviors such as using profanity or flashing in online video chats, and many others. These issues have a direct negative impact on the social, psychological and in some cases physical well-being of the end users. An important characteristic of these issues is that they fall in a grey legal area, where perpetrators may claim freedom of speech or rights to free expression despite causing harm. The main goal of this inaugural workshop on cybersafety is to bring together the researchers and practitioners from academia, industry, government and research labs working in the area of cybersafety to discuss the unique challenges in addressing various cybersafety issues and to share experiences, solutions, tools, and techniques. The focus is on the detection, prevention and mitigation of various cybersafety issues, as well as education and promoting safe practices. 
Topics of interest include but are not limited to the following: Cyberbullying in social media, Cyberthreats, coercion, and undue social pressure, Misbehaving users in online video chat services, Trolls in chat rooms, discussion boards and other social media, Deception to shape opinion, such as spinning, Deceptive techniques targeted at vulnerable populations such as the elderly and K-12 minors, Bad actors in social media, Online exposure of inappropriate material to minors, Education and promoting safe practices, and Remedies for preventing or thwarting cybersafety issues.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125724370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Annotating Points of Interest with Geo-tagged Tweets
Kaiqi Zhao, G. Cong, Aixin Sun
{"title":"Annotating Points of Interest with Geo-tagged Tweets","authors":"Kaiqi Zhao, G. Cong, Aixin Sun","doi":"10.1145/2983323.2983850","DOIUrl":"https://doi.org/10.1145/2983323.2983850","url":null,"abstract":"Microblogging services like Twitter contain abundant of user generated content covering a wide range of topics. Many of the tweets can be associated to real-world entities for providing additional information for the latter. In this paper, we aim to associate tweets that are semantically related to real-world locations or Points of Interest (POIs). Tweets contain dynamic and real-time information while POIs contain relatively static information. The tweets associated with POIs provide complementary information for many applications like opinion mining and POI recommendation; the associated POIs can also be used as POI tags in Twitter. We define the research problem of annotating POIs with tweets and propose a novel supervised Bayesian Model (sBM). The model takes into account the textual, spatial features and user behaviors together with the supervised information of whether a tweet is POI-related. It is able to capture user interests in latent regions for the prediction of whether a tweet is POI-related and the association between the tweet and its most semantically related POI. 
On tweets and POIs collected for two cities (New York City and Singapore), we demonstrate the effectiveness of our models against baseline methods.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126667026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
A Personal Perspective and Retrospective on Web Search Technology
A. Broder
{"title":"A Personal Perspective and Retrospective on Web Search Technology","authors":"A. Broder","doi":"10.1145/2983323.2983368","DOIUrl":"https://doi.org/10.1145/2983323.2983368","url":null,"abstract":"This talk is a review of some Web research and predictions that I co-authored over the last two decades: both what turned out gratifyingly right and what turned out embarrassingly wrong. Topics will include near-duplicates, the Web graph, query intent, inverted indices efficiency, and others. While this seems a completely idiosyncratic collection there are in fact concealed connections that offer good clues to the big question: what will happen next?","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114245708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Understanding Sparse Topical Structure of Short Text via Stochastic Variational-Gibbs Inference
Tianyi Lin, Siyuan Zhang, Hong Cheng
{"title":"Understanding Sparse Topical Structure of Short Text via Stochastic Variational-Gibbs Inference","authors":"Tianyi Lin, Siyuan Zhang, Hong Cheng","doi":"10.1145/2983323.2983765","DOIUrl":"https://doi.org/10.1145/2983323.2983765","url":null,"abstract":"With the soaring popularity of online social media like Twitter, analyzing short text has emerged as an increasingly important task which is challenging to classical topic models, as topic sparsity exists in short text. Topic sparsity refers to the observation that individual document usually concentrates on several salient topics, which may be rare in entire corpus. Understanding this sparse topical structure of short text has been recognized as the key ingredient for mining user-generated Web content and social medium, which are featured in the form of extremely short posts and discussions. However, the existing sparsity-enhanced topic models all assume over-complicated generative process, which severely limits their scalability and makes them unable to automatically infer the number of topics from data. In this paper, we propose a probabilistic Bayesian topic model, namely Sparse Dirichlet mixture Topic Model (SparseDTM), based on Indian Buffet Process (IBP) prior, and infer our model on the large text corpora through a novel inference procedure called stochastic variational-Gibbs inference. Unlike prior work, the proposed approach is able to achieve exact sparse topical structure of large short text collections, and automatically identify the number of topics with a good balance between completeness and homogeneity of topic coherence. Experiments on different genres of large text corpora demonstrate that our approach outperforms various existing sparse topic models. 
The improvement is significant on large-scale collections of short text.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"61 9-10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120917715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
FeatureMiner: A Tool for Interactive Feature Selection
Kewei Cheng, Jundong Li, Huan Liu
{"title":"FeatureMiner: A Tool for Interactive Feature Selection","authors":"Kewei Cheng, Jundong Li, Huan Liu","doi":"10.1145/2983323.2983329","DOIUrl":"https://doi.org/10.1145/2983323.2983329","url":null,"abstract":"The recent popularity of big data has brought immense quantities of high-dimensional data, which presents challenges to traditional data mining tasks due to curse of dimensionality. Feature selection has shown to be effective to prepare these high dimensional data for a variety of learning tasks. To provide easy access to feature selection algorithms, we provide an interactive feature selection tool FeatureMiner based on our recently released feature selection repository scikit-feature. FeatureMiner eases the process of performing feature selection for practitioners by providing an interactive user interface. Meanwhile, it also gives users some practical guidance in finding a suitable feature selection algorithm among many given a specific dataset. In this demonstration, we show (1) How to conduct data preprocessing after loading a dataset; (2) How to apply feature selection algorithms; (3) How to choose a suitable algorithm by visualized performance evaluation.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"14 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120936349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
DI-DAP: An Efficient Disaster Information Delivery and Analysis Platform in Disaster Management
Tao Li, Wubai Zhou, Chunqiu Zeng, Qing Wang, Qifeng Zhou, Dingding Wang, Jia Xu, Yue Huang, Wentao Wang, Minjing Zhang, Steven Luis, Shu‐Ching Chen, N. Rishe
{"title":"DI-DAP: An Efficient Disaster Information Delivery and Analysis Platform in Disaster Management","authors":"Tao Li, Wubai Zhou, Chunqiu Zeng, Qing Wang, Qifeng Zhou, Dingding Wang, Jia Xu, Yue Huang, Wentao Wang, Minjing Zhang, Steven Luis, Shu‐Ching Chen, N. Rishe","doi":"10.1145/2983323.2983355","DOIUrl":"https://doi.org/10.1145/2983323.2983355","url":null,"abstract":"In disaster management, people are interested in the development and the evolution of the disasters. If they intend to track the information of the disaster, they will be overwhelmed by the large number of disaster-related documents, microblogs, and news, etc. To support disaster management and minimize the loss during the disaster, it is necessary to efficiently and effectively collect, deliver, summarize, and analyze the disaster information, letting people in affected area quickly gain an overview of the disaster situation and improve their situational awareness. To present an integrated solution to address the information explosion problem during the disaster period, we designed and implemented DI-DAP, an efficient and effective disaster information delivery and analysis platform. DI-DAP is an information centric information platform aiming to provide convenient, interactive, and timely disaster information to the users in need. It is composed of three separated but complementary services: Disaster Vertical Search Engine, Disaster Storyline Generation, and Geo-Spatial Data Analysis Portal. These services provide a specific set of functionalities to enable users to consume highly summarized information and allow them to conduct ad-hoc geospatial information retrieval tasks. To support these services, DI-DAP adopts FIU-Miner, a fast, integrated, and user-friendly data analysis platform, which encapsulated all the computation and analysis workflow as well-defined tasks. 
Moreover, to enable ad-hoc geospatial information retrieval, an advanced query language MapQL is used and the query template engine is integrated. DI-DAP is designed and implemented as a disaster management tool and is currently been exercised as the disaster information platform by more than 100 companies and institutions in South Florida area.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"717 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116127527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11