Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining最新文献

筛选
英文 中文
Fake News Research: Theories, Detection Strategies, and Open Problems 假新闻研究:理论、检测策略和开放性问题
R. Zafarani, Xinyi Zhou, Kai Shu, Huan Liu
{"title":"Fake News Research: Theories, Detection Strategies, and Open Problems","authors":"R. Zafarani, Xinyi Zhou, Kai Shu, Huan Liu","doi":"10.1145/3292500.3332287","DOIUrl":"https://doi.org/10.1145/3292500.3332287","url":null,"abstract":"Fake news has become a global phenomenon due its explosive growth, particularly on social media. The goal of this tutorial is to (1) clearly introduce the concept and characteristics of fake news and how it can be formally differentiated from other similar concepts such as mis-/dis-information, satire news, rumors, among others, which helps deepen the understanding of fake news; (2) provide a comprehensive review of fundamental theories across disciplines and illustrate how they can be used to conduct interdisciplinary fake news research, facilitating a concerted effort of experts in computer and information science, political science, journalism, social science, psychology and economics. Such concerted efforts can result in highly efficient and explainable fake news detection; (3) systematically present fake news detection strategies from four perspectives (i.e., knowledge, style, propagation, and credibility) and the ways that each perspective utilizes techniques developed in data/graph mining, machine learning, natural language processing, and information retrieval; and (4) detail open issues within current fake news studies to reveal great potential research opportunities, hoping to attract researchers within a broader area to work on fake news detection and further facilitate its development. The tutorial aims to promote a fair, healthy and safe online information and news dissemination ecosystem, hoping to attract more researchers, engineers and students with various interests to fake news research. Few prerequisite are required for KDD participants to attend.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121472196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 52
State-Sharing Sparse Hidden Markov Models for Personalized Sequences 个性化序列的状态共享稀疏隐马尔可夫模型
Hongzhi Shi, Chao Zhang, Quanming Yao, Yong Li, Funing Sun, Depeng Jin
{"title":"State-Sharing Sparse Hidden Markov Models for Personalized Sequences","authors":"Hongzhi Shi, Chao Zhang, Quanming Yao, Yong Li, Funing Sun, Depeng Jin","doi":"10.1145/3292500.3330828","DOIUrl":"https://doi.org/10.1145/3292500.3330828","url":null,"abstract":"Hidden Markov Model (HMM) is a powerful tool that has been widely adopted in sequence modeling tasks, such as mobility analysis, healthcare informatics, and online recommendation. However, using HMM for modeling personalized sequences remains a challenging problem: training a unified HMM with all the sequences often fails to uncover interesting personalized patterns; yet training one HMM for each individual inevitably suffers from data scarcity. We address this challenge by proposing a state-sharing sparse hidden Markov model (S3HMM) that can uncover personalized sequential patterns without suffering from data scarcity. This is achieved by two design principles: (1) all the HMMs in the ensemble share the same set of latent states; and (2) each HMM has its own transition matrix to model the personalized transitions. The result optimization problem for S3HMM becomes nontrivial, because of its two-layer hidden state design and the non-convexity in parameter estimation. We design a new Expectation-Maximization algorithm based, which treats the difference of convex programming as a sub-solver to optimize the non-convex function in the M-step with convergence guarantee. Our experimental results show that, S3HMM can successfully uncover personalized sequential patterns in various applications and outperforms baselines significantly in downstream prediction tasks.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114159633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
Buying or Browsing?: Predicting Real-time Purchasing Intent using Attention-based Deep Network with Multiple Behavior 购买还是浏览?基于多重行为的基于注意力的深度网络预测实时购买意图
Long Guo, Lifeng Hua, Rongfei Jia, Binqiang Zhao, Xiaobo Wang, B. Cui
{"title":"Buying or Browsing?: Predicting Real-time Purchasing Intent using Attention-based Deep Network with Multiple Behavior","authors":"Long Guo, Lifeng Hua, Rongfei Jia, Binqiang Zhao, Xiaobo Wang, B. Cui","doi":"10.1145/3292500.3330670","DOIUrl":"https://doi.org/10.1145/3292500.3330670","url":null,"abstract":"E-commerce platforms are becoming a primary place for people to find, compare and ultimately purchase products. One of the fundamental questions that arises in e-commerce is to predict user purchasing intent, which is an important part of user understanding and allows for providing better services for both sellers and customers. However, previous work cannot predict real-time user purchasing intent with a high accuracy, limited by the representation capability of traditional browse-interactive behavior adopted. In this paper, we propose a novel end-to-end deep network, named Deep Intent Prediction Network (DIPN), to predict real-time user purchasing intent. In particular, besides the traditional browse-interactive behavior, we collect a new type of user interactive behavior, called touch-interactive behavior, which can capture more fine-grained real-time user features. To combine these behavior effectively, we propose a hierarchical attention mechanism, where the bottom attention layer focuses on the inner parts of each behavior sequence while the top attention layer learns the inter-view relations between different behavior sequences. In addition, we propose to train DIPN with multi-task learning to better distinguish user behavior patterns. In the experiments conducted on a large-scale industrial dataset, DIPN significantly outperforms the baseline solutions. Notably, DIPN gains about 18.96% improvement on AUC than the state-of-the-art solution only using traditional browse-interactive behavior sequences. Moreover, DIPN has been deployed in the operational system of Taobao. Online A/B testing results with more than 12.9 millions of users reveal the potential of knowing users' real-time purchasing intent.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121673662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 67
Learning Sleep Quality from Daily Logs 从日常日志中学习睡眠质量
S. Park, Cheng-te Li, Sungwon Han, Cheng-Mao Hsu, Sang Won Lee, M. Cha
{"title":"Learning Sleep Quality from Daily Logs","authors":"S. Park, Cheng-te Li, Sungwon Han, Cheng-Mao Hsu, Sang Won Lee, M. Cha","doi":"10.1145/3292500.3330792","DOIUrl":"https://doi.org/10.1145/3292500.3330792","url":null,"abstract":"Precision psychiatry is a new research field that uses advanced data mining over a wide range of neural, behavioral, psychological, and physiological data sources for classification of mental health conditions. This study presents a computational framework for predicting sleep efficiency of insomnia sufferers. A smart band experiment is conducted to collect heterogeneous data, including sleep records, daily activities, and demographics, whose missing values are imputed via Improved Generative Adversarial Imputation Networks (Imp-GAIN). Equipped with the imputed data, we predict sleep efficiency of individual users with a proposed interpretable LSTM-Attention (LA Block) neural network model. We also propose a model, Pairwise Learning-based Ranking Generation (PLRG), to rank users with high insomnia potential in the next day. We discuss implications of our findings from the perspective of a psychiatric practitioner. Our computational framework can be used for other applications that analyze and handle noisy and incomplete time-series human activity data in the domain of precision psychiatry.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121813647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network 基于随机递归神经网络的多元时间序列鲁棒异常检测
Ya Su, Youjian Zhao, Chenhao Niu, Rong Liu, Wei Sun, Dan Pei
{"title":"Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network","authors":"Ya Su, Youjian Zhao, Chenhao Niu, Rong Liu, Wei Sun, Dan Pei","doi":"10.1145/3292500.3330672","DOIUrl":"https://doi.org/10.1145/3292500.3330672","url":null,"abstract":"Industry devices (i.e., entities) such as server machines, spacecrafts, engines, etc., are typically monitored with multivariate time series, whose anomaly detection is critical for an entity's service quality management. However, due to the complex temporal dependence and stochasticity of multivariate time series, their anomaly detection remains a big challenge. This paper proposes OmniAnomaly, a stochastic recurrent neural network for multivariate time series anomaly detection that works well robustly for various devices. Its core idea is to capture the normal patterns of multivariate time series by learning their robust representations with key techniques such as stochastic variable connection and planar normalizing flow, reconstruct input data by the representations, and use the reconstruction probabilities to determine anomalies. Moreover, for a detected entity anomaly, OmniAnomaly can provide interpretations based on the reconstruction probabilities of its constituent univariate time series. The evaluation experiments are conducted on two public datasets from aerospace and a new server machine dataset (collected and released by us) from an Internet company. OmniAnomaly achieves an overall F1-Score of 0.86 in three real-world datasets, signicantly outperforming the best performing baseline method by 0.09. The interpretation accuracy for OmniAnomaly is up to 0.89.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123859958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 584
Deep Bayesian Mining, Learning and Understanding 深度贝叶斯挖掘,学习和理解
Jen-Tzung Chien
{"title":"Deep Bayesian Mining, Learning and Understanding","authors":"Jen-Tzung Chien","doi":"10.1145/3292500.3332267","DOIUrl":"https://doi.org/10.1145/3292500.3332267","url":null,"abstract":"This tutorial addresses the advances in deep Bayesian mining and learning for natural language with ubiquitous applications ranging from speech recognition to document summarization, text classification, text segmentation, information extraction, image caption generation, sentence generation, dialogue control, sentiment classification, recommendation system, question answering and machine translation, to name a few. Traditionally, \"deep learning\" is taken to be a learning process where the inference or optimization is based on the real-valued deterministic model. The \"semantic structure\" in words, sentences, entities, actions and documents drawn from a large vocabulary may not be well expressed or correctly optimized in mathematical logic or computer programs. The \"distribution function\" in discrete or continuous latent variable model for natural language may not be properly decomposed or estimated. This tutorial addresses the fundamentals of statistical models and neural networks, and focus on a series of advanced Bayesian models and deep models including hierarchical Dirichlet process, Chinese restaurant process, hierarchical Pitman-Yor process, Indian buffet process, recurrent neural network, long short-term memory, sequence-to-sequence model, variational auto-encoder, generative adversarial network, attention mechanism, memory-augmented neural network, skip neural network, stochastic neural network, predictive state neural network, policy neural network. We present how these models are connected and why they work for a variety of applications on symbolic and complex patterns in natural language. The variational inference and sampling method are formulated to tackle the optimization for complicated models. The word and sentence embeddings, clustering and co-clustering are merged with linguistic and semantic constraints. A series of case studies are presented to tackle different issues in deep Bayesian mining, learning and understanding. At last, we will point out a number of directions and outlooks for future studies.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124934251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Revisiting kd-tree for Nearest Neighbor Search 重访kd树进行最近邻搜索
P. Ram, Kaushik Sinha
{"title":"Revisiting kd-tree for Nearest Neighbor Search","authors":"P. Ram, Kaushik Sinha","doi":"10.1145/3292500.3330875","DOIUrl":"https://doi.org/10.1145/3292500.3330875","url":null,"abstract":"kdtree citefriedman1976algorithm has long been deemed unsuitable for exact nearest-neighbor search in high dimensional data. The theoretical guarantees and the empirical performance of kdtree do not show significant improvements over brute-force nearest-neighbor search in moderate to high dimensions. kdtree has been used relatively more successfully for approximate search citemuja2009flann but lack theoretical guarantees. In the article, we build upon randomized-partition trees citedasgupta2013randomized to propose kdtree based approximate search schemes with $O(d łog d + łog n)$ query time for data sets with n points in d dimensions and rigorous theoretical guarantees on the search accuracy. We empirically validate the search accuracy and the query time guarantees of our proposed schemes, demonstrating the significantly improved scaling for same level of accuracy.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125274672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 59
Contextual Fact Ranking and Its Applications in Table Synthesis and Compression 上下文事实排序及其在表合成和压缩中的应用
Silu Huang, Jialu Liu, Flip Korn, Xuezhi Wang, You Wu, Dale Markowitz, Cong Yu
{"title":"Contextual Fact Ranking and Its Applications in Table Synthesis and Compression","authors":"Silu Huang, Jialu Liu, Flip Korn, Xuezhi Wang, You Wu, Dale Markowitz, Cong Yu","doi":"10.1145/3292500.3330980","DOIUrl":"https://doi.org/10.1145/3292500.3330980","url":null,"abstract":"Modern search engines increasingly incorporate tabular content, which consists of a set of entities each augmented with a small set of facts. The facts can be obtained from multiple sources: an entity's knowledge base entry, the infobox on its Wikipedia page, or its row within a WebTable. Crucially, the informativeness of a fact depends not only on the entity but also the specific context(e.g., the query).To the best of our knowledge, this paper is the first to study the problem of contextual fact ranking: given some entities and a context (i.e., succinct natural language description), identify the most informative facts for the entities collectively within the context.We propose to contextually rank the facts by exploiting deep learning techniques. In particular, we develop pointwise and pair-wise ranking models, using textual and statistical information for the given entities and context derived from their sources. We enhance the models by incorporating entity type information from an IsA (hypernym) database. We demonstrate that our approaches achieve better performance than state-of-the-art baselines in terms of MAP, NDCG, and recall. We further conduct user studies for two specific applications of contextual fact ranking-table synthesis and table compression-and show that our models can identify more informative facts than the baselines.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122797703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
MVAN: Multi-view Attention Networks for Real Money Trading Detection in Online Games 基于多视角注意力网络的在线游戏真钱交易检测
Jianrong Tao, Jianshi Lin, Shize Zhang, Sha Zhao, Runze Wu, Changjie Fan, Peng Cui
{"title":"MVAN: Multi-view Attention Networks for Real Money Trading Detection in Online Games","authors":"Jianrong Tao, Jianshi Lin, Shize Zhang, Sha Zhao, Runze Wu, Changjie Fan, Peng Cui","doi":"10.1145/3292500.3330687","DOIUrl":"https://doi.org/10.1145/3292500.3330687","url":null,"abstract":"Online gaming is a multi-billion dollar industry that entertains a large, global population. However, one unfortunate phenomenon known as real money trading harms the competition and the fun. Real money trading is an interesting economic activity used to exchange assets in a virtual world with real world currencies, leading to imbalance of game economy and inequality of wealth and opportunity. Game operation teams have been devoting much efforts on real money trading detection, however, it still remains a challenging task. To overcome the limitation from traditional methods conducted by game operation teams, we propose, MVAN, the first multi-view attention networks for detecting real money trading with multi-view data sources. We present a multi-graph attention network (MGAT) in the graph structure view, a behavior attention network (BAN) in the vertex content view, a portrait attention network (PAN) in the vertex attribute view and a data source attention network (DSAN) in the data source view. Experiments conducted on real-world game logs from a commercial NetEase MMORPG( JusticePC) show that our method consistently performs promising results compared with other competitive methods over time and verifiy the importance and rationality of attention mechanisms. MVAN is deployed to several MMORPGs in NetEase in practice and achieving remarkable performance improvement and acceleration. Our method can easily generalize to other types of related tasks in real world, such as fraud detection, drug tracking and money laundering tracking etc.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122908984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 23
Tensorized Determinantal Point Processes for Recommendation 用于推荐的张张化行列式点过程
Romain Warlop, Jérémie Mary, Mike Gartrell
{"title":"Tensorized Determinantal Point Processes for Recommendation","authors":"Romain Warlop, Jérémie Mary, Mike Gartrell","doi":"10.1145/3292500.3330952","DOIUrl":"https://doi.org/10.1145/3292500.3330952","url":null,"abstract":"Interest in determinantal point processes (DPPs) is increasing in machine learning due to their ability to provide an elegant parametric model over combinatorial sets. In particular, the number of required parameters in a DPP grows only quadratically with the size of the ground set (e.g., item catalog), while the number of possible sets of items grows exponentially. Recent work has shown that DPPs can be effective models for product recommendation and basket completion tasks, since they are able to account for both the diversity and quality of items within a set. We present an enhanced DPP model that is specialized for the task of basket completion, the tensorized DPP. We leverage ideas from tensor factorization in order to customize the model for the next-item basket completion task, where the next item is captured in an extra dimension of the model. We evaluate our model on several real-world datasets, and find that the tensorized DPP provides significantly better predictive quality in several settings than a number of state-of-the art models.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"477 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127820719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信