Daizong Ding, Mi Zhang, Xudong Pan, Min Yang, Xiangnan He
{"title":"Modeling Extreme Events in Time Series Prediction","authors":"Daizong Ding, Mi Zhang, Xudong Pan, Min Yang, Xiangnan He","doi":"10.1145/3292500.3330896","DOIUrl":"https://doi.org/10.1145/3292500.3330896","url":null,"abstract":"Time series prediction is an intensively studied topic in data mining. In spite of considerable improvements, recent deep learning-based methods overlook the existence of extreme events, which results in weak performance when applying them to real time series. Extreme events are rare and random, but play a critical role in many real applications, such as the forecasting of financial crises and natural disasters. In this paper, we explore the central theme of improving the ability of deep learning to model extreme events for time series prediction. Through the lens of formal analysis, we first find that the weakness of deep learning methods is rooted in the conventional form of quadratic loss. To address this issue, we take inspiration from Extreme Value Theory, developing a new form of loss called Extreme Value Loss (EVL) for detecting the future occurrence of extreme events. Furthermore, we propose to employ a Memory Network in order to memorize extreme events in historical records. By incorporating EVL with an adapted memory network module, we achieve an end-to-end framework for time series prediction with extreme events. Through extensive experiments on synthetic data and two real datasets of stock and climate, we empirically validate the effectiveness of our framework. 
Besides, we also provide a proper choice for hyper-parameters in our proposed framework by conducting several additional experiments.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130235691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DuerQuiz","authors":"Chuan Qin, Hengshu Zhu, Chen Zhu, Tong Xu, Fuzhen Zhuang, Chao Ma, Jingshuai Zhang, Hui Xiong","doi":"10.1145/3292500.3330706","DOIUrl":"https://doi.org/10.1145/3292500.3330706","url":null,"abstract":"In talent recruitment, the job interview aims at selecting the right candidates for the right jobs through assessing their skills and experiences in relation to the job positions. While tremendous efforts have been made in improving job interviews, a long-standing challenge is how to design appropriate interview questions for comprehensively assessing the competencies that may be deemed relevant and representative for person-job fit. To this end, in this research, we focus on the development of a personalized question recommender system, namely DuerQuiz, for enhancing the job interview assessment. DuerQuiz is a fully deployed system, in which a knowledge graph of job skills, Skill-Graph, has been built for comprehensively modeling the relevant competencies that should be assessed in the job interview. Specifically, we first develop a novel skill entity extraction approach based on a bidirectional Long Short-Term Memory (LSTM) with a Conditional Random Field (CRF) layer (LSTM-CRF) neural network enhanced with an adapted gate mechanism. In particular, to improve the reliability of extracted skill entities, we design a label propagation method based on more than 10 billion click-through data from the large-scale Baidu query logs. Furthermore, we discover the hypernym-hyponym relations between skill entities and construct the Skill-Graph by leveraging a classifier trained with extensive contextual features. Finally, we design a personalized question recommendation algorithm based on the Skill-Graph for improving the efficiency and effectiveness of job interview assessment. 
Extensive experiments on real-world recruitment data clearly validate the effectiveness of DuerQuiz, which was deployed for generating written exercises in the 2018 Baidu campus recruitment event and achieved remarkable performance in terms of efficiency and effectiveness in selecting outstanding talents compared with a traditional non-personalized human-only assessment approach.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129257474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FDML","authors":"Yaochen Hu, Di Niu, Jianming Yang, Shengping Zhou","doi":"10.1145/3292500.3330765","DOIUrl":"https://doi.org/10.1145/3292500.3330765","url":null,"abstract":"Most current distributed machine learning systems try to scale up model training by using a data-parallel architecture that divides the computation for different samples among workers. We study distributed machine learning from a different motivation, where the information about the same samples, e.g., users and objects, is owned by several parties that wish to collaborate but do not want to share raw data with each other. We propose an asynchronous stochastic gradient descent (SGD) algorithm for such a feature distributed machine learning (FDML) problem, to jointly learn from distributed features, with theoretical convergence guarantees under bounded asynchrony. Our algorithm does not require sharing the original features or even local model parameters between parties, thus preserving the data locality. The system can also easily incorporate differential privacy mechanisms to preserve a higher level of privacy. We implement the FDML system in a parameter server architecture and compare our system with fully centralized learning (which violates data locality) and learning based on only local features, through extensive experiments performed on both a public dataset, a9a, and a large dataset of 5,000,000 records and 8700 decentralized features from three collaborating apps at Tencent, including Tencent MyApp, Tencent QQ Browser and Tencent Mobile Safeguard. 
Experimental results have demonstrated that the proposed FDML system can be used to significantly enhance app recommendation in Tencent MyApp by leveraging user and item features from other apps, while preserving the locality and privacy of features in each individual app to a high degree.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"6 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120901263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Visual Dialog Augmented Interactive Recommender System","authors":"Tong Yu, Yilin Shen, Hongxia Jin","doi":"10.1145/3292500.3330991","DOIUrl":"https://doi.org/10.1145/3292500.3330991","url":null,"abstract":"Traditional recommender systems rely on user feedback, such as ratings or clicks on the items, to analyze the user interest and provide personalized recommendations. However, rating or click feedback is limited in that it does not tell exactly why users like or dislike an item. If a user does not like the recommendations and cannot effectively express the reasons via rating and clicking, the feedback from the user may be very sparse. These limitations lead to inefficient model learning of the recommender system. To address these limitations, more effective user feedback on the recommendations should be designed, so that the system can effectively understand a user's preference and improve the recommendations over time. In this paper, we propose a novel dialog-based recommender system to interactively recommend a list of items with visual appearance. At each round, the user receives a list of recommended items with visual appearance. The user can point to some items and describe their feedback in natural language, such as the features they desire in the items. With this natural language based feedback, the recommender system updates and provides another list of items. To model the user behaviors of viewing, commenting and clicking on a list of items, we propose a visual dialog augmented cascade model. To efficiently understand the user preference and learn the model, exploration should be encouraged to provide more diverse recommendations to quickly collect user feedback on more attributes of the items. We propose a variant of the cascading bandits, where the neural representations of the item images and user feedback in natural language are utilized. 
In a task of recommending a list of footwear, we show that our visual dialog augmented interactive recommender needs only around 41.03% of the rounds of recommendations needed by the traditional interactive recommender that relies only on user click behavior.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121247124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relation Extraction via Domain-aware Transfer Learning","authors":"Shimin Di, Yanyan Shen, Lei Chen","doi":"10.1145/3292500.3330890","DOIUrl":"https://doi.org/10.1145/3292500.3330890","url":null,"abstract":"Relation extraction in knowledge base construction has been researched for decades due to its applicability to many problems. Most classical works, such as supervised information extraction and distant supervision, focus on how to construct the knowledge base (KB) by utilizing a large number of labels or certain related KBs. However, in many real-world scenarios, the existing methods may not perform well when a new knowledge base is required but only scarce labels or few related KBs are available. In this paper, we propose a novel approach, called Relation Extraction via Domain-aware Transfer Learning (ReTrans), to extract relation mentions from a given text corpus by exploring the experience from a large number of existing KBs which may not be closely related to the target relation. We first propose to initialize the representation of relation mentions from the massive text corpus and update those representations according to existing KBs. Based on the representations of relation mentions, we investigate the contribution of each KB to the target task and propose to select useful KBs for boosting the effectiveness of the proposed approach. Based on selected KBs, we develop a novel domain-aware transfer learning framework to transfer knowledge from source domains to the target domain, aiming to infer the true relation mentions in the unstructured text corpus. Most importantly, we give the stability and generalization bound of ReTrans. 
Experimental results on real-world datasets demonstrate the effectiveness of our approach, which outperforms all the state-of-the-art baselines.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127146003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rui Yan, Ran Le, Yang Song, Tao Zhang, Xiangliang Zhang, Dongyan Zhao
{"title":"Interview Choice Reveals Your Preference on the Market: To Improve Job-Resume Matching through Profiling Memories","authors":"Rui Yan, Ran Le, Yang Song, Tao Zhang, Xiangliang Zhang, Dongyan Zhao","doi":"10.1145/3292500.3330963","DOIUrl":"https://doi.org/10.1145/3292500.3330963","url":null,"abstract":"Online recruitment services are now rapidly changing the landscape of hiring traditions on the job market. There are hundreds of millions of registered users with resumes, and tens of millions of job postings available on the Web. Learning good job-resume matching for recruitment services is important. Existing studies on job-resume matching generally focus on learning good representations of job descriptions and resume texts with comprehensive matching structures. We assume that it would be beneficial to learn the preferences of both recruiters and job-seekers from previous interview histories, and expect such preferences to help improve job-resume matching. To this end, in this paper, we propose a novel matching network that models preference. The key idea is to explore the latent preference given the history of all interviewed candidates for a job posting and the history of all job applications for a particular talent. To be more specific, we propose a profiling memory module to learn the latent preference representation by interacting with both the job and resume sides. We then incorporate the preference into the matching framework as an end-to-end learnable neural network. Based on real-world data from an online recruitment platform, namely \"Boss Zhipin\", the experimental results show that the proposed model could improve the job-resume matching performance over a series of state-of-the-art methods. 
In this way, we demonstrate that recruiters and talents indeed have preferences and that such preferences can improve job-resume matching on the job market.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127773673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering Unexpected Local Nonlinear Interactions in Scientific Black-box Models","authors":"Michael Doron, Idan Segev, Dafna Shahaf","doi":"10.1145/3292500.3330886","DOIUrl":"https://doi.org/10.1145/3292500.3330886","url":null,"abstract":"Scientific computational models are crucial for analyzing and understanding complex real-life systems that are otherwise difficult for experimentation. However, the complex behavior and the vast input-output space of these models often make them opaque, slowing the discovery of novel phenomena. In this work, we present HINT (Hessian INTerestingness) -- a new algorithm that can automatically and systematically explore black-box models and highlight local nonlinear interactions in the input-output space of the model. This tool aims to facilitate the discovery of interesting model behaviors that are unknown to the researchers. Using this simple yet powerful tool, we were able to correctly rank all pairwise interactions in known benchmark models and do so faster and with greater accuracy than state-of-the-art methods. We further applied HINT to existing computational neuroscience models, and were able to reproduce important scientific discoveries that were published years after the creation of those models. 
Finally, we ran HINT on two real-world models (in neuroscience and earth science) and found new behaviors of the model that were of value to domain experts.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128765832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optuna: A Next-generation Hyperparameter Optimization Framework","authors":"Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, Masanori Koyama","doi":"10.1145/3292500.3330701","DOIUrl":"https://doi.org/10.1145/3292500.3330701","url":null,"abstract":"The purpose of this study is to introduce new design criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-set-up, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to lightweight experiments conducted via an interactive interface. In order to prove our point, we will introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed with the define-by-run principle, Optuna is the first of its kind. We will present the design techniques that became necessary in the development of software that meets the above criteria, and demonstrate the power of our new design through experimental results and real-world applications. 
Our software is available under the MIT license (https://github.com/pfnet/optuna/).","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114933029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Aharon, O. Somekh, Avi Shahar, Assaf Singer, Baruch Trayvas, Hadas Vogel, Dobrislav Dobrev
{"title":"Carousel Ads Optimization in Yahoo Gemini Native","authors":"M. Aharon, O. Somekh, Avi Shahar, Assaf Singer, Baruch Trayvas, Hadas Vogel, Dobrislav Dobrev","doi":"10.1145/3292500.3330740","DOIUrl":"https://doi.org/10.1145/3292500.3330740","url":null,"abstract":"Yahoo's native advertising (also known as Gemini native) serves billions of ad impressions daily, reaching a yearly run-rate of many hundreds of millions of USD. Driving Gemini native models for predicting both click probability (pCTR) and conversion probability (pCONV) is OFFSET - a feature-enhanced collaborative-filtering (CF) based event prediction algorithm. The predicted pCTRs are then used in Gemini native auctions to determine which ads to present for each serving event. A fast-growing segment of Gemini native is Carousel ads that include several cards (or assets) which are used to populate several slots within the ad. Since Carousel ad slots are not symmetrical and some are more conspicuous than others, it is beneficial to render assets to slots in a way that maximizes revenue. In this work we present a post-auction, successive-elimination-based approach for ranking assets according to their click-through rate (CTR) and rendering the carousel accordingly, placing higher CTR assets in more conspicuous slots. After a successful online bucket showing 8.6% CTR and 4.3% CPM (or revenue) lifts over a control bucket that uses a predefined advertiser assets-to-slots mapping, the carousel asset optimization (CAO) system was pushed to production and has been serving all Gemini native traffic since. A few months after CAO deployment, we have already measured an almost 40% increase in carousel ads revenue. 
Moreover, the entire revenue growth is related to an increase in CAO traffic due to additional advertiser demand, which demonstrates high advertiser satisfaction with the product.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132222660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Earth Observations from a New Generation of Geostationary Satellites","authors":"R. Nemani","doi":"10.1145/3292500.3340413","DOIUrl":"https://doi.org/10.1145/3292500.3340413","url":null,"abstract":"The latest generation of geostationary satellites carry sensors such as the Advanced Baseline Imager (GOES-16/17) and the Advanced Himawari Imager (Himawari-8/9) that closely mimic the spatial and spectral characteristics of widely used polar orbiting sensors such as EOS/MODIS. More importantly, they provide observations at 1-5-15 minute intervals, instead of twice a day from MODIS, offering unprecedented opportunities for monitoring large parts of the Earth. In addition to serving the needs of weather forecasting, these observations offer new and exciting opportunities in managing solar power, fighting wildfires, and tracking air pollution. Creation of actionable information in near real time from these data streams is a challenge that is best addressed through collaborative efforts among industry, academia, and government agencies.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133211958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}