Zhaodong Wang, Zhiwei Qin, Xiaocheng Tang, Jieping Ye, Hongtu Zhu
{"title":"Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching","authors":"Zhaodong Wang, Zhiwei Qin, Xiaocheng Tang, Jieping Ye, Hongtu Zhu","doi":"10.1109/ICDM.2018.00077","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00077","url":null,"abstract":"Ride dispatching is a central operation task on a ride-sharing platform to continuously match drivers to trip-requesting passengers. In this work, we model the ride dispatching problem as a Markov Decision Process and propose learning solutions based on deep Q-networks with action search to optimize the dispatching policy for drivers on ride-sharing platforms. We train and evaluate dispatching agents for this challenging decision task using real-world spatio-temporal trip data from the DiDi ride-sharing platform. A large-scale dispatching system typically supports many geographical locations with diverse demand-supply settings. To increase learning adaptability and efficiency, we propose a new transfer learning method, Correlated Feature Progressive Transfer, along with two existing methods, enabling knowledge transfer in both spatial and temporal spaces. Through an extensive set of experiments, we demonstrate the learning and optimization capabilities of our deep reinforcement learning algorithms. 
We further show that dispatching policies learned by transferring knowledge from a source city to target cities or across temporal space within the same city significantly outperform those without transfer learning.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115684615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lamine Diop, Cheikh Talibouya Diop, A. Giacometti, Dominique H. Li, Arnaud Soulet
{"title":"Sequential Pattern Sampling with Norm Constraints","authors":"Lamine Diop, Cheikh Talibouya Diop, A. Giacometti, Dominique H. Li, Arnaud Soulet","doi":"10.1109/ICDM.2018.00024","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00024","url":null,"abstract":"In recent years, the field of pattern mining has shifted to user-centered methods. In such a context, it is necessary to have a tight coupling between the system and the user, where mining techniques provide results at any time or within a short response time of only a few seconds. Pattern sampling is a non-exhaustive method for instantly discovering relevant patterns that ensures good interactivity while providing strong statistical guarantees due to its random nature. Curiously, such an approach, investigated for itemsets and subgraphs, has not yet been applied to sequential patterns, which are useful for a wide range of mining tasks and application fields. In this paper, we propose the first method for sequential pattern sampling. In addition to addressing sequential data, the originality of our approach is to introduce a constraint on the norm to control the length of the drawn patterns and to avoid the pitfall of the \"long tail\" where the rarest patterns flood the user. We propose a new constrained two-step random procedure, named CSSampling, that randomly draws sequential patterns according to frequency with an interval constraint on the norm. We demonstrate that this method performs exact sampling. Moreover, despite the use of rejection sampling, the experimental study shows that CSSampling remains efficient and the constraint helps to draw general patterns of the \"head\". We also illustrate how to benefit from these sampled patterns to instantly build an associative classifier dedicated to sequences. 
This classification approach rivals state-of-the-art proposals, showing the interest of constrained sequential pattern sampling.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116652883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Low Rank Weighted Graph Convolutional Approach to Weather Prediction","authors":"T. Wilson, P. Tan, L. Luo","doi":"10.1109/ICDM.2018.00078","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00078","url":null,"abstract":"Weather forecasting is an important but challenging problem as one must contend with the inherent non-linearities and spatiotemporal autocorrelation present in the data. This paper presents a novel deep learning approach based on a coupled weighted graph convolutional LSTM (WGC-LSTM) to address these challenges. Specifically, our proposed approach uses an LSTM to capture the inherent temporal autocorrelation of the data and a graph convolution to model its spatial relationships. As the weather condition can be influenced by various spatial factors besides the distance between locations, e.g., topography, prevailing winds and jet streams, imposing a fixed graph structure based on the proximity between locations is insufficient to train a robust deep learning model. Instead, our proposed approach treats the adjacency matrix of the graph as a model parameter that can be learned from the training data. However, this introduces an additional O(|V|^2) parameters to be estimated, where |V| is the number of locations. With large graphs, this may also lead to slower performance as well as susceptibility to overfitting. We propose a modified version of our approach that can address this difficulty by assuming that the adjacency matrix is either sparse or low rank. 
Experimental results using two real-world weather datasets show that WGC-LSTM outperforms all other baseline methods for the majority of the evaluated locations.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127171770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
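The low-rank assumption mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' WGC-LSTM implementation: it only assumes that a learned adjacency matrix A is factored as A = U V^T with a small rank r, which reduces the number of free parameters from |V|^2 to 2|V|r. The names `low_rank_adjacency`, `parameter_counts`, `n_locations`, and `rank` are hypothetical and chosen for illustration only.

```python
import numpy as np


def low_rank_adjacency(n_locations, rank, seed=0):
    """Build an adjacency matrix A = U @ V.T from two thin factors.

    Hypothetical helper: in a trained model U and V would be learned;
    here they are random placeholders.
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n_locations, rank))
    V = rng.standard_normal((n_locations, rank))
    return U, V, U @ V.T


def parameter_counts(n_locations, rank):
    """Return (full, low_rank) parameter counts for the adjacency matrix."""
    return n_locations ** 2, 2 * n_locations * rank


U, V, A = low_rank_adjacency(n_locations=100, rank=5)
full, low = parameter_counts(100, 5)
print(A.shape, full, low)
```

With 100 locations and rank 5, the factored form stores 1,000 values instead of 10,000, which is the kind of saving the low-rank assumption is meant to provide.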
{"title":"Deep Learning Based Scalable Inference of Uncertain Opinions","authors":"Xujiang Zhao, F. Chen, Jin-Hee Cho","doi":"10.1109/ICDM.2018.00096","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00096","url":null,"abstract":"Subjective Logic (SL) is one of the well-known belief models that can explicitly deal with uncertain opinions and infer unknown opinions based on a rich set of operators for fusing multiple opinions. Due to its high simplicity and applicability, SL has been widely applied to a variety of decision-making tasks in the areas of cybersecurity, opinion models, and/or trust / social network analysis. However, SL has been facing a scalability issue when dealing with large-scale network data. In addition, SL has shown bounded prediction accuracy due to its inherent parametric nature, treating heterogeneous data and network structure homogeneously based on the assumption of a Bayesian network. In this work, we take one step further to deal with uncertain opinions for unknown opinion inference. We propose a deep learning (DL)-based opinion inference model while node-level opinions are still formalized based on SL. The proposed DL-based opinion inference model handles node-level opinions explicitly in a large-scale network using graph convolutional network (GCN) and variational autoencoder (VAE) techniques. We adopted the GCN and VAE due to their powerful learning capabilities in dealing with large-scale network data without parametric fusion operators and/or a Bayesian network assumption. This work is the first that leverages the merits of both DL (i.e., GCN and VAE) and a belief model (i.e., SL), where each node-level opinion is modeled by the formalism of SL while GCN and VAE are used to achieve non-parametric learning with low complexity. 
By mapping the node-level opinions modeled by the GCN to their equivalent Beta PDFs (probability density functions), we develop a network-driven VAE to maximize prediction accuracy of unknown opinions while significantly reducing algorithmic complexity. We validate our proposed DL-based algorithm using real-world datasets via extensive simulation experiments for comparative performance analysis.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"280 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123430706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Which Outlier Detector Should I use?","authors":"K. Ting, Sunil Aryal, T. Washio","doi":"10.1109/ICDM.2018.00015","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00015","url":null,"abstract":"This tutorial has four aims: (1) Providing the current comparative works on different outlier detectors, and analysing the strengths and weaknesses of these works and their recommendations. (2) Presenting non-obvious applications of outlier detectors. This provides examples of how outlier detectors are used in areas which are not normally considered to be the domains of outlier detection. (3) Inviting the research community to explore future research directions, in terms of both comparative study and outlier detection in general. (4) Giving advice on the factors to consider when choosing an outlier detector, and on the strengths and weaknesses of some \"top\" recommended algorithms based on the current understanding in the literature.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125360551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Record2Vec: Unsupervised Representation Learning for Structured Records","authors":"Adelene Y. L. Sim, Andrew Borthwick","doi":"10.1109/ICDM.2018.00165","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00165","url":null,"abstract":"Structured records - data with a fixed number of descriptive fields (or attributes) - are often represented by one-hot encoded or term frequency-inverse document frequency (TF-IDF) weighted vectors. These vectors are typically sparse and long, and are inefficient in representing structured records. Here, we introduce Record2Vec, a framework for generating dense embeddings of structured records by training associations between attributes within record instances. We build our embedding from a simple premise that structured records have attributes that are associated, and therefore we can train the embedding of an attribute based on other attributes (or context), much like how we train embeddings for words based on their surrounding context. Because this embedding technique is general and does not assume the availability of any labeled data, it is extendable across different domains and fields. We demonstrate its utility in the context of clustering, record matching, movie rating and movie genre prediction.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129519387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Local Low-Rank Hawkes Processes for Temporal User-Item Interactions","authors":"Jin Shang, Mingxuan Sun","doi":"10.1109/ICDM.2018.00058","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00058","url":null,"abstract":"Hawkes processes have become very popular in modeling multiple recurrent user-item interaction events that exhibit mutual-excitation properties in various domains. Generally, modeling the interaction sequence of each user-item pair as an independent Hawkes process is ineffective since the prediction accuracy of future event occurrences for users and items with few observed interactions is low. On the other hand, multivariate Hawkes processes (MHPs) can be used to handle multi-dimensional random processes where different dimensions are correlated with each other. However, an MHP either fails to describe the correct mutual influence between dimensions or becomes computationally prohibitive in most real-world events involving a large collection of users and items. To tackle this challenge, we propose local low-rank Hawkes processes to model large-scale user-item interactions, which efficiently capture the correlations of Hawkes processes in different dimensions. In addition, we design an efficient convex optimization algorithm to estimate model parameters and present a parallel algorithm to further increase the computational efficiency. 
Extensive experiments on real-world datasets demonstrate the performance improvements of our model in comparison with the state of the art.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129540963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Liang Zhang, Keli Xiao, Hengshu Zhu, Chuanren Liu, Jingyuan Yang, Bo Jin
{"title":"CADEN: A Context-Aware Deep Embedding Network for Financial Opinions Mining","authors":"Liang Zhang, Keli Xiao, Hengshu Zhu, Chuanren Liu, Jingyuan Yang, Bo Jin","doi":"10.1109/ICDM.2018.00091","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00091","url":null,"abstract":"Following the recent advances of artificial intelligence, financial text mining has gained new potential to benefit theoretical research with practical impact. An essential research question for financial text mining is how to accurately identify the actual financial opinions (e.g., bullish or bearish) behind words in plain text. Traditional methods mainly consider this task as a text classification problem with solutions based on machine learning algorithms. However, most of them rely heavily on hand-crafted features extracted from the text. Indeed, a critical issue along this line is that the latent global and local contexts of the financial opinions usually cannot be fully captured. To this end, we propose a context-aware deep embedding network for financial text mining, named CADEN, which jointly encodes the global and local contextual information. In particular, we capture and include an attitude-aware user embedding to enhance the performance of our model. We validate our method with extensive experiments based on a real-world dataset and several state-of-the-art baselines for investor sentiment recognition. 
Our results show a consistently superior performance of our approach for identifying the financial opinions from texts of different formats.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128658890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustered Lifelong Learning Via Representative Task Selection","authors":"Gan Sun, Yang Cong, Yu Kong, Xiaowei Xu","doi":"10.1109/ICDM.2018.00167","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00167","url":null,"abstract":"Consider the lifelong machine learning problem where the objective is to learn new consecutive tasks depending on previously accumulated experiences, i.e., a knowledge library. In comparison with most state-of-the-art methods, which adopt a knowledge library of prescribed size, in this paper we propose a new incremental clustered lifelong learning model with two libraries, a feature library and a model library, called Clustered Lifelong Learning (CL3), in which the feature library maintains a set of learned features common across all the encountered tasks, and the model library is learned by identifying and adding representative models (clusters). When a new task arrives, the original task model can first be reconstructed from representative models measured by capped l2-norm distance, i.e., effectively assigning the new task model to multiple representative models under the feature library. Based on this assignment knowledge of the new task, the objective of our CL3 model is to transfer the knowledge from both the feature library and the model library to learn the new task. A new task 1) with a higher outlier probability will then be judged as a new representative, and used to refine both the feature library and the representative models over time; 2) with a lower outlier probability will only update the feature library. For the model optimisation, we cast this problem as an alternating direction minimization problem. 
The performance of CL3 is evaluated by comparing it with most lifelong learning models, and even with some batch clustered multi-task learning models.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127037764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characteristic Subspace Learning for Time Series Classification","authors":"Yuanduo He, Jialiang Pei, Xu Chu, Yasha Wang, Zhu Jin, Guangju Peng","doi":"10.1109/ICDM.2018.00128","DOIUrl":"https://doi.org/10.1109/ICDM.2018.00128","url":null,"abstract":"This paper presents a novel time series classification algorithm. It exploits time-delay embedding to transform a time series into a set of points treated as a distribution, and attempts to classify time series by classifying the corresponding distributions. It proposes a novel geometrical feature, i.e., the characteristic subspace, derived from the embedding points for classification, and leverages a class-weighted support vector machine (SVM) to learn it. An efficient boosting strategy is also developed to enable linear-time training. The experiments show the great potential of this novel algorithm in terms of accuracy, efficiency and interpretability.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130631833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
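The time-delay embedding step named in the abstract above can be sketched as follows. This is a generic illustration of the standard technique, not the authors' code: each embedded point is the vector (x[t], x[t + delay], ..., x[t + (dim - 1) * delay]). The function name `time_delay_embedding` and the parameters `dim` and `delay` are assumptions made for this example; the characteristic-subspace feature and the SVM stage are not shown.

```python
import numpy as np


def time_delay_embedding(series, dim, delay=1):
    """Map a 1-D series to a set of dim-dimensional delay vectors.

    Row t of the result is (x[t], x[t + delay], ..., x[t + (dim - 1) * delay]).
    """
    x = np.asarray(series, dtype=float)
    n_points = len(x) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("series too short for the requested dim and delay")
    # Index grid: one row per embedded point, one column per delay step.
    idx = np.arange(n_points)[:, None] + delay * np.arange(dim)[None, :]
    return x[idx]


points = time_delay_embedding([0, 1, 2, 3, 4, 5], dim=3, delay=2)
print(points)
```

For the six-sample series above with `dim=3` and `delay=2`, the embedding yields two points, (0, 2, 4) and (1, 3, 5); a classifier would then operate on such point sets rather than on the raw series.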