Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining最新文献_第4页

AutoNE

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330848

Ke Tu, Jianxin Ma, Peng Cui, J. Pei, Wenwu Zhu

{"title":"AutoNE","authors":"Ke Tu, Jianxin Ma, Peng Cui, J. Pei, Wenwu Zhu","doi":"10.1145/3292500.3330848","DOIUrl":"https://doi.org/10.1145/3292500.3330848","url":null,"abstract":"Network embedding (NE) aims to embed the nodes of a network into a vector space, and serves as the bridge between machine learning and network data. Despite their widespread success, NE algorithms typically contain a large number of hyperparameters for preserving the various network properties, which must be carefully tuned in order to achieve satisfactory performance. Though automated machine learning (AutoML) has achieved promising results when applied to many types of data such as images and texts, network data poses great challenges to AutoML and remains largely ignored by the literature of AutoML. The biggest obstacle is the massive scale of real-world networks, along with the coupled node relationships that make any straightforward sampling strategy problematic. In this paper, we propose a novel framework, named AutoNE, to automatically optimize the hyperparameters of a NE algorithm on massive networks. In detail, we employ a multi-start random walk strategy to sample several small sub-networks, perform each trial of configuration selection on the sampled sub-network, and design a meta-leaner to transfer the knowledge about optimal hyperparameters from the sub-networks to the original massive network. The transferred meta-knowledge greatly reduces the number of trials required when predicting the optimal hyperparameters for the original network. Extensive experiments demonstrate that our framework can significantly outperform the existing methods, in that it needs less time and fewer trials to find the optimal hyperparameters.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115693006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Log2Intent Log2Intent

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330889

Zhiqiang Tao, Sheng Li, Zhaowen Wang, Chen Fang, Longqi Yang, Handong Zhao, Y. Fu

{"title":"Log2Intent","authors":"Zhiqiang Tao, Sheng Li, Zhaowen Wang, Chen Fang, Longqi Yang, Handong Zhao, Y. Fu","doi":"10.1145/3292500.3330889","DOIUrl":"https://doi.org/10.1145/3292500.3330889","url":null,"abstract":"Modeling user behavior from unstructured software log-trace data is critical in providing personalized service (emphe.g., cross-platform recommendation). Existing user modeling approaches cannot well handle the long-term temporal information in log data, or produce semantically meaningful results for interpreting user logs. To address these challenges, we propose a Log2Intent framework for interpretable user modeling in this paper. Log2Intent adopts a deep sequential modeling framework that contains a temporal encoder, a semantic encoder and a log action decoder, and it fully captures the long-term temporal information in user sessions. Moreover, to bridge the semantic gap between log-trace data and human language, a recurrent semantics memory unit (RSMU) is proposed to encode the annotation sentences from an auxiliary software tutorial dataset, and the output of RSMU is fed into the semantic encoder of Log2Intent. Comprehensive experiments on a real-world Photoshop log-trace dataset with an auxiliary Photoshop tutorial dataset demonstrate the effectiveness of the proposed Log2Intent framework over the state-of-the-art log-trace user modeling method in three different tasks, including log annotation retrieval, user interest detection and user next action prediction.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124288249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Dynamical Origins of Distribution Functions 分布函数的动力学起源

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330842

Chengxi Zang, Peng Cui, Wenwu Zhu, Fei Wang

{"title":"Dynamical Origins of Distribution Functions","authors":"Chengxi Zang, Peng Cui, Wenwu Zhu, Fei Wang","doi":"10.1145/3292500.3330842","DOIUrl":"https://doi.org/10.1145/3292500.3330842","url":null,"abstract":"Many real-world problems are time-evolving in nature, such as the progression of diseases, the cascading process when a post is broadcasting in a social network, or the changing of climates. The observational data characterizing these complex problems are usually only available at discrete time stamps, this makes the existing research on analyzing these problems mostly based on a cross-sectional analysis. In this paper, we try to model these time-evolving phenomena by a dynamic system and the data sets observed at different time stamps are probability distribution functions generated by such a dynamic system. We propose a theorem which builds a mathematical relationship between a dynamical system modeled by differential equations and the distribution function (or survival function) of the cross-sectional states of this system. We then develop a survival analysis framework to learn the differential equations of a dynamical system from its cross-sectional states. With such a framework, we are able to capture the continuous-time dynamics of an evolutionary system.We validate our framework on both synthetic and real-world data sets. The experimental results show that our framework is able to discover and capture the generative dynamics of various data distributions accurately. Our study can potentially facilitate scientific discoveries of the unknown dynamics of complex systems in the real world.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114405381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Towards ML Engineering with TensorFlow Extended (TFX) 使用TensorFlow Extended (TFX)实现机器学习工程

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3340408

Konstantinos Katsiapis, Kevin Haas

{"title":"Towards ML Engineering with TensorFlow Extended (TFX)","authors":"Konstantinos Katsiapis, Kevin Haas","doi":"10.1145/3292500.3340408","DOIUrl":"https://doi.org/10.1145/3292500.3340408","url":null,"abstract":"The discipline of Software Engineering has evolved over the past 5+ decades to good levels of maturity. This maturity is in fact both a blessing and a necessity, since the modern world largely depends on it. At the same time, the popularity of Machine Learning (ML) has been steadily increasing over the past 2+ decades, and over the last decade ML is being increasingly used for both experimentation and production workloads. It is no longer uncommon for ML to power widely used applications and products that are integral parts of our life. Much like what was the case for Software Engineering, the proliferation of use of ML technology necessitates the evolution of the ML discipline from \"Coding\" to \"Engineering\". Gus Katsiapis offers a view from the trenches of using and building end-to-end ML platforms, and shares collective knowledge and experience, gothered over more than a decade of applied ML at Google. We hope this helps pave the way towards a world of ML Engineering. Kevin Haas offers an overview of TensorFlow Extended (TFX), the end-to-end machine learning platform for TensorFlow that powers products across all of Alphabet (and beyond). TFX helps effectively manage the end-to-end training and production workflow including model management, versioning, and serving, thereby helping one realize aspects of ML Engineering.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"160 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115009487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 7

A Representation Learning Framework for Property Graphs 属性图的表示学习框架

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330948

Yifan Hou, Hongzhi Chen, Changji Li, James Cheng, Ming Yang

{"title":"A Representation Learning Framework for Property Graphs","authors":"Yifan Hou, Hongzhi Chen, Changji Li, James Cheng, Ming Yang","doi":"10.1145/3292500.3330948","DOIUrl":"https://doi.org/10.1145/3292500.3330948","url":null,"abstract":"Representation learning on graphs, also called graph embedding, has demonstrated its significant impact on a series of machine learning applications such as classification, prediction and recommendation. However, existing work has largely ignored the rich information contained in the properties (or attributes) of both nodes and edges of graphs in modern applications, e.g., those represented by property graphs. To date, most existing graph embedding methods either focus on plain graphs with only the graph topology, or consider properties on nodes only. We propose PGE, a graph representation learning framework that incorporates both node and edge properties into the graph embedding procedure. PGE uses node clustering to assign biases to differentiate neighbors of a node and leverages multiple data-driven matrices to aggregate the property information of neighbors sampled based on a biased strategy. PGE adopts the popular inductive model for neighborhood aggregation. We provide detailed analyses on the efficacy of our method and validate the performance of PGE by showing how PGE achieves better embedding results than the state-of-the-art graph embedding methods on benchmark applications such as node classification and link prediction over real-world datasets.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114632436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 32

Learning From Networks: Algorithms, Theory, and Applications 从网络中学习:算法、理论和应用

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3332293

Xiao Huang, Peng Cui, Yuxiao Dong, Jundong Li, Huan Liu, J. Pei, Le Song, Jie Tang, Fei Wang, Hongxia Yang, Wenwu Zhu

{"title":"Learning From Networks: Algorithms, Theory, and Applications","authors":"Xiao Huang, Peng Cui, Yuxiao Dong, Jundong Li, Huan Liu, J. Pei, Le Song, Jie Tang, Fei Wang, Hongxia Yang, Wenwu Zhu","doi":"10.1145/3292500.3332293","DOIUrl":"https://doi.org/10.1145/3292500.3332293","url":null,"abstract":"Arguably, every entity in this universe is networked in one wayr another. With the prevalence of network data collected, such as social media and biological networks, learning from networks has become an essential task in many applications. It is well recognized that network data is intricate and large-scale, and analytic tasks on network data become more and more sophisticated. In this tutorial, we systematically review the area of learning from networks, including algorithms, theoretical analysis, and illustrative applications. Starting with a quick recollection of the exciting history of the area, we formulate the core technical problems. Then, we introduce the fundamental approaches, that is, the feature selection based approaches and the network embedding based approaches. Next, we extend our discussion to attributed networks, which are popular in practice. Last, we cover the latest hot topic, graph neural based approaches. For each group of approaches, we also survey the associated theoretical analysis and real-world application examples. Our tutorial also inspires a series of open problems and challenges that may lead to future breakthroughs. The authors are productive and seasoned researchers active in this area who represent a nice combination of academia and industry.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116918795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Blending Noisy Social Media Signals with Traditional Movement Variables to Predict Forced Migration 混合嘈杂的社交媒体信号和传统的移动变量来预测被迫迁移

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330774

L. Singh, Laila Wahedi, Yanchen Wang, Yifang Wei, Christo Kirov, Susan F. Martin, K. Donato, Yaguang Liu, Kornraphop Kawintiranon

引用次数: 16

Feedback Shaping: A Modeling Approach to Nurture Content Creation 反馈塑造:培养内容创造的建模方法

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330764

Ye Tu, Chun Lo, Yiping Yuan, S. Chatterjee

{"title":"Feedback Shaping: A Modeling Approach to Nurture Content Creation","authors":"Ye Tu, Chun Lo, Yiping Yuan, S. Chatterjee","doi":"10.1145/3292500.3330764","DOIUrl":"https://doi.org/10.1145/3292500.3330764","url":null,"abstract":"Social media platforms bring together content creators and content consumers through recommender systems like newsfeed. The focus of such recommender systems has thus far been primarily on modeling the content consumer preferences and optimizing for their experience. However, it is equally critical to nurture content creation by prioritizing the creators' interests, as quality content forms the seed for sustainable engagement and conversations, bringing in new consumers while retaining existing ones. In this work, we propose a modeling approach to predict how feedback from content consumers incentivizes creators. We then leverage this model to optimize the newsfeed experience for content creators by reshaping the feedback distribution, leading to a more active content ecosystem. Practically, we discuss how we balance the user experience for both consumers and creators, and how we carry out online A/B tests with strong network effects. We present a deployed use case on the LinkedIn newsfeed, where we used this approach to improve content creation significantly without compromising the consumers' experience.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126164636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 7

Automatic Dialogue Summary Generation for Customer Service 自动对话摘要生成的客户服务

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330683

Chunyi Liu, Peng Wang, Jiang Xu, Zang Li, Jieping Ye

{"title":"Automatic Dialogue Summary Generation for Customer Service","authors":"Chunyi Liu, Peng Wang, Jiang Xu, Zang Li, Jieping Ye","doi":"10.1145/3292500.3330683","DOIUrl":"https://doi.org/10.1145/3292500.3330683","url":null,"abstract":"Dialogue summarization extracts useful information from a dialogue. It helps people quickly capture the highlights of a dialogue without going through long and sometimes twisted utterances. For customer service, it saves human resources currently required to write dialogue summaries. A main challenge of dialogue summarization is to design a mechanism to ensure the logic, integrity, and correctness of the summaries. In this paper, we introduce auxiliary key point sequences to solve this problem. A key point sequence describes the logic of the summary. In our training procedure, a key point sequence acts as an auxiliary label. It helps the model learn the logic of the summary. In the prediction procedure, our model predicts the key point sequence first and then uses it to guide the prediction of the summary. Along with the auxiliary key point sequence, we propose a novel Leader-Writer network. The Leader net predicts the key point sequence, and the Writer net predicts the summary based on the decoded key point sequence. The Leader net ensures the summary is logical and integral. The Writer net focuses on generating fluent sentences. We test our model on customer service scenarios. The results show that our model outperforms other models not only on BLEU and ROUGE-L score but also on logic and integrity.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129549232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 91

A Hierarchical Career-Path-Aware Neural Network for Job Mobility Prediction 面向职业流动预测的分层职业路径感知神经网络

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining Pub Date : 2019-07-25 DOI: 10.1145/3292500.3330969

Qingxin Meng, Hengshu Zhu, Keli Xiao, Le Zhang, Hui Xiong

{"title":"A Hierarchical Career-Path-Aware Neural Network for Job Mobility Prediction","authors":"Qingxin Meng, Hengshu Zhu, Keli Xiao, Le Zhang, Hui Xiong","doi":"10.1145/3292500.3330969","DOIUrl":"https://doi.org/10.1145/3292500.3330969","url":null,"abstract":"The understanding of job mobility can benefit talent management operations in a number of ways, such as talent recruitment, talent development, and talent retention. While there is extensive literature showing the predictability of the organization-level job mobility patterns (e.g., in terms of the employee turnover rate), there are no effective solutions for supporting the understanding of job mobility at an individual level. To this end, in this paper, we propose a hierarchical career-path-aware neural network for learning individual-level job mobility. Specifically, we aim at answering two questions related to individuals in their career paths: 1) who will be the next employer? 2) how long will the individual work in the new position? Specifically, our model exploits a hierarchical neural network structure with embedded attention mechanism for characterizing the internal and external job mobility. Also, it takes personal profile information into consideration in the learning process. Finally, the extensive results on real-world data show that the proposed model can lead to significant improvements in prediction accuracy for the two aforementioned prediction problems. Moreover, we show that the above two questions are well addressed by our model with a certain level of interpretability. For the case studies, we provide data-driven evidence showing interesting patterns associated with various factors (e.g., job duration, firm type, etc.) in the job mobility prediction process.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129600037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 43