2018 IEEE International Conference on Data Mining (ICDM): Latest Papers

Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00077
Zhaodong Wang, Zhiwei Qin, Xiaocheng Tang, Jieping Ye, Hongtu Zhu
Abstract: Ride dispatching is a central operation task on a ride-sharing platform to continuously match drivers to trip-requesting passengers. In this work, we model the ride dispatching problem as a Markov Decision Process and propose learning solutions based on deep Q-networks with action search to optimize the dispatching policy for drivers on ride-sharing platforms. We train and evaluate dispatching agents for this challenging decision task using real-world spatio-temporal trip data from the DiDi ride-sharing platform. A large-scale dispatching system typically supports many geographical locations with diverse demand-supply settings. To increase learning adaptability and efficiency, we propose a new transfer learning method, Correlated Feature Progressive Transfer, along with two existing methods, enabling knowledge transfer in both spatial and temporal spaces. Through an extensive set of experiments, we demonstrate the learning and optimization capabilities of our deep reinforcement learning algorithms. We further show that dispatching policies learned by transferring knowledge from a source city to target cities or across temporal space within the same city significantly outperform those without transfer learning.
Citations: 85
Sequential Pattern Sampling with Norm Constraints
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00024
Lamine Diop, Cheikh Talibouya Diop, A. Giacometti, Dominique H. Li, Arnaud Soulet
Abstract: In recent years, the field of pattern mining has shifted to user-centered methods. In such a context, it is necessary to have a tight coupling between the system and the user, where mining techniques provide results at any time or within a short response time of only a few seconds. Pattern sampling is a non-exhaustive method for instantly discovering relevant patterns that ensures good interactivity while providing strong statistical guarantees due to its random nature. Curiously, such an approach, investigated for itemsets and subgraphs, has not yet been applied to sequential patterns, which are useful for a wide range of mining tasks and application fields. In this paper, we propose the first method for sequential pattern sampling. In addition to addressing sequential data, the originality of our approach is to introduce a constraint on the norm to control the length of the drawn patterns and to avoid the pitfall of the "long tail" where the rarest patterns flood the user. We propose a new constrained two-step random procedure, named CSSampling, that randomly draws sequential patterns according to frequency with an interval constraint on the norm. We demonstrate that this method performs an exact sampling. Moreover, despite the use of rejection sampling, the experimental study shows that CSSampling remains efficient and that the constraint helps to draw general patterns of the "head". We also illustrate how to benefit from these sampled patterns to instantly build an associative classifier dedicated to sequences. This classification approach rivals state-of-the-art proposals, showing the interest of constrained sequential pattern sampling.
Citations: 12
A Low Rank Weighted Graph Convolutional Approach to Weather Prediction
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00078
T. Wilson, P. Tan, L. Luo
Abstract: Weather forecasting is an important but challenging problem, as one must contend with the inherent non-linearities and spatiotemporal autocorrelation present in the data. This paper presents a novel deep learning approach based on a coupled weighted graph convolutional LSTM (WGC-LSTM) to address these challenges. Specifically, our proposed approach uses an LSTM to capture the inherent temporal autocorrelation of the data and a graph convolution to model its spatial relationships. As the weather condition can be influenced by various spatial factors besides the distance between locations, e.g., topography, prevailing winds and jet streams, imposing a fixed graph structure based on the proximity between locations is insufficient to train a robust deep learning model. Instead, our proposed approach treats the adjacency matrix of the graph as a model parameter that can be learned from the training data. However, this introduces an additional O(|V|^2) parameters to be estimated, where |V| is the number of locations. With large graphs this may also lead to slower performance as well as susceptibility to overfitting. We propose a modified version of our approach that can address this difficulty by assuming that the adjacency matrix is either sparse or low rank. Experimental results using two real-world weather datasets show that WGC-LSTM outperforms all other baseline methods for the majority of the evaluated locations.
Citations: 26
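As a rough illustration of the parameter-count argument in the abstract above: a fully learned adjacency matrix over |V| locations has |V|^2 entries, whereas a rank-r factorization A = U V^T needs only 2|V|r. The sketch below is a minimal example under that assumption; the function names, the specific factorization form, and the example sizes are hypothetical and are not taken from the paper.

```python
def dense_adjacency_params(num_locations: int) -> int:
    """Number of free entries in a fully learned |V| x |V| adjacency matrix."""
    return num_locations * num_locations


def low_rank_adjacency_params(num_locations: int, rank: int) -> int:
    """Number of free entries when A = U V^T with U, V of shape |V| x rank."""
    return 2 * num_locations * rank


def low_rank_adjacency(u, v):
    """Materialize A[i][j] = sum_k u[i][k] * v[j][k] from the two factors."""
    return [[sum(a * b for a, b in zip(u_row, v_row)) for v_row in v] for u_row in u]


# Hypothetical sizes: 1000 locations, rank 10.
dense_count = dense_adjacency_params(1000)
low_rank_count = low_rank_adjacency_params(1000, 10)
```

With these assumed sizes the dense form has 1,000,000 parameters and the rank-10 form 20,000, which is the kind of reduction the low-rank assumption is meant to buy.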
Deep Learning Based Scalable Inference of Uncertain Opinions
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00096
Xujiang Zhao, F. Chen, Jin-Hee Cho
Abstract: Subjective Logic (SL) is one of the well-known belief models that can explicitly deal with uncertain opinions and infer unknown opinions based on a rich set of operators for fusing multiple opinions. Due to its high simplicity and applicability, SL has been widely applied to a variety of decision-making problems in cybersecurity, opinion models, and/or trust / social network analysis. However, SL faces a scalability issue when dealing with large-scale network data. In addition, SL has shown bounded prediction accuracy due to its inherent parametric nature, treating heterogeneous data and network structure homogeneously based on the assumption of a Bayesian network. In this work, we take one step further to deal with uncertain opinions for unknown opinion inference. We propose a deep learning (DL)-based opinion inference model while node-level opinions are still formalized based on SL. The proposed DL-based opinion inference model handles node-level opinions explicitly in a large-scale network using graph convolutional network (GCN) and variational autoencoder (VAE) techniques. We adopted the GCN and VAE due to their powerful learning capabilities in dealing with large-scale network data without parametric fusion operators and/or a Bayesian network assumption. This work is the first that leverages the merits of both DL (i.e., GCN and VAE) and a belief model (i.e., SL), where each node-level opinion is modeled by the formalism of SL while GCN and VAE are used to achieve non-parametric learning with low complexity. By mapping the node-level opinions modeled by the GCN to their equivalent Beta PDFs (probability density functions), we develop a network-driven VAE to maximize prediction accuracy of unknown opinions while significantly reducing algorithmic complexity. We validate our proposed DL-based algorithm using real-world datasets via extensive simulation experiments for comparative performance analysis.
Citations: 5
Which Outlier Detector Should I use?
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00015
K. Ting, Sunil Aryal, T. Washio
Abstract: This tutorial has four aims: (1) Providing the current comparative works on different outlier detectors, and analysing the strengths and weaknesses of these works and their recommendations. (2) Presenting non-obvious applications of outlier detectors. This provides examples of how outlier detectors are used in areas which are not normally considered to be the domains of outlier detection. (3) Inviting the research community to explore future research directions, in terms of both comparative study and outlier detection in general. (4) Giving advice on the factors to consider when choosing an outlier detector, and on the strengths and weaknesses of some "top" recommended algorithms based on the current understanding in the literature.
Citations: 4
Record2Vec: Unsupervised Representation Learning for Structured Records
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00165
Adelene Y. L. Sim, Andrew Borthwick
Abstract: Structured records, i.e., data with a fixed number of descriptive fields (or attributes), are often represented by one-hot encoded or term frequency-inverse document frequency (TF-IDF) weighted vectors. These vectors are typically sparse and long, and are inefficient in representing structured records. Here, we introduce Record2Vec, a framework for generating dense embeddings of structured records by training associations between attributes within record instances. We build our embedding from the simple premise that structured records have attributes that are associated, and therefore we can train the embedding of an attribute based on other attributes (or context), much like how we train embeddings for words based on their surrounding context. Because this embedding technique is general and does not assume the availability of any labeled data, it is extendable across different domains and fields. We demonstrate its utility in the context of clustering, record matching, movie rating and movie genre prediction.
Citations: 9
Local Low-Rank Hawkes Processes for Temporal User-Item Interactions
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00058
Jin Shang, Mingxuan Sun
Abstract: Hawkes processes have become very popular in modeling multiple recurrent user-item interaction events that exhibit mutual-excitation properties in various domains. Generally, modeling the interaction sequence of each user-item pair as an independent Hawkes process is ineffective, since the prediction accuracy of future event occurrences for users and items with few observed interactions is low. On the other hand, multivariate Hawkes processes (MHPs) can be used to handle multi-dimensional random processes where different dimensions are correlated with each other. However, an MHP either fails to describe the correct mutual influence between dimensions or becomes computationally prohibitive in most real-world events involving a large collection of users and items. To tackle this challenge, we propose local low-rank Hawkes processes to model large-scale user-item interactions, which efficiently capture the correlations of Hawkes processes in different dimensions. In addition, we design an efficient convex optimization algorithm to estimate model parameters and present a parallel algorithm to further increase the computation efficiency. Extensive experiments on real-world datasets demonstrate the performance improvements of our model in comparison with the state of the art.
Citations: 9
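For readers unfamiliar with the self-excitation idea mentioned in the abstract above, the following is a minimal sketch of the conditional intensity of a single univariate Hawkes process. It assumes an exponential decay kernel, lambda(t) = mu + alpha * sum over past events of exp(-beta * (t - t_i)); the abstract does not state which kernel the paper uses, and the function and parameter names here are hypothetical. It does not implement the local low-rank model itself.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity at time t for an exponential-kernel Hawkes process.

    mu    : baseline rate (assumed constant)
    alpha : jump in intensity contributed by each past event
    beta  : exponential decay rate of that contribution
    Only events strictly before t contribute.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation


# With no history the intensity is just the baseline.
baseline_only = hawkes_intensity(1.0, [], mu=0.5, alpha=1.0, beta=1.0)
# One event at time 0 raises the intensity at t = 1 by alpha * exp(-beta).
after_one_event = hawkes_intensity(1.0, [0.0], mu=0.5, alpha=1.0, beta=1.0)
```

Each past interaction temporarily raises the rate of future interactions, and the effect fades over time; this is the recurrent, mutually exciting behaviour the abstract refers to.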
CADEN: A Context-Aware Deep Embedding Network for Financial Opinions Mining
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00091
Liang Zhang, Keli Xiao, Hengshu Zhu, Chuanren Liu, Jingyuan Yang, Bo Jin
Abstract: Following the recent advances of artificial intelligence, financial text mining has gained new potential to benefit theoretical research with practical impact. An essential research question for financial text mining is how to accurately identify the actual financial opinions (e.g., bullish or bearish) behind words in plain text. Traditional methods mainly consider this task as a text classification problem with solutions based on machine learning algorithms. However, most of them rely heavily on hand-crafted features extracted from the text. Indeed, a critical issue along this line is that the latent global and local contexts of the financial opinions usually cannot be fully captured. To this end, we propose a context-aware deep embedding network for financial text mining, named CADEN, which jointly encodes the global and local contextual information. In particular, we capture and include an attitude-aware user embedding to enhance the performance of our model. We validate our method with extensive experiments based on a real-world dataset and several state-of-the-art baselines for investor sentiment recognition. Our results show a consistently superior performance of our approach for identifying the financial opinions from texts of different formats.
Citations: 18
Clustered Lifelong Learning Via Representative Task Selection
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00167
Gan Sun, Yang Cong, Yu Kong, Xiaowei Xu
Abstract: Consider the lifelong machine learning problem where the objective is to learn new consecutive tasks depending on previously accumulated experiences, i.e., a knowledge library. In comparison with most state-of-the-art methods, which adopt a knowledge library of prescribed size, in this paper we propose a new incremental clustered lifelong learning model with two libraries, a feature library and a model library, called Clustered Lifelong Learning (CL3), in which the feature library maintains a set of learned features common across all the encountered tasks, and the model library is learned by identifying and adding representative models (clusters). When a new task arrives, the original task model can first be reconstructed by representative models measured by capped l2-norm distance, i.e., effectively assigning the new task model to multiple representative models under the feature library. Based on this assignment knowledge of the new task, the objective of our CL3 model is to transfer the knowledge from both the feature library and the model library to learn the new task. A new task 1) with a higher outlier probability will then be judged as a new representative, and used to refine both the feature library and the representative models over time; 2) with a lower outlier probability will only update the feature library. For the model optimisation, we cast this problem as an alternating direction minimization problem. Finally, the performance of CL3 is evaluated by comparison with most lifelong learning models and even some batch clustered multi-task learning models.
Citations: 1
Characteristic Subspace Learning for Time Series Classification
2018 IEEE International Conference on Data Mining (ICDM) Pub Date: 2018-11-01 DOI: 10.1109/ICDM.2018.00128
Yuanduo He, Jialiang Pei, Xu Chu, Yasha Wang, Zhu Jin, Guangju Peng
Abstract: This paper presents a novel time series classification algorithm. It exploits time-delay embedding to transform a time series into a set of points treated as a distribution, and attempts to classify time series by classifying the corresponding distributions. It proposes a novel geometrical feature, i.e., the characteristic subspace, derived from the embedding points for classification, and leverages a class-weighted support vector machine (SVM) to learn it. An efficient boosting strategy is also developed to enable linear-time training. The experiments show the great potential of this novel algorithm in accuracy, efficiency and interpretability.
Citations: 2
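The time-delay embedding step mentioned in the abstract above can be sketched as follows. This is the standard construction, in which each point collects `dimension` samples spaced `delay` steps apart; the function name, parameter names and example values are hypothetical, and the paper's characteristic-subspace feature and SVM stage are not shown.

```python
def time_delay_embedding(series, dimension, delay):
    """Map a 1-D series to points (x[t], x[t+delay], ..., x[t+(dimension-1)*delay]).

    Returns an empty list when the series is too short to form a single point.
    """
    if dimension < 1 or delay < 1:
        raise ValueError("dimension and delay must be positive integers")
    span = (dimension - 1) * delay  # distance between first and last sample of a point
    return [
        tuple(series[t + k * delay] for k in range(dimension))
        for t in range(len(series) - span)
    ]


# Hypothetical toy series embedded in 3 dimensions with unit delay.
points = time_delay_embedding([1, 2, 3, 4, 5], dimension=3, delay=1)
```

The resulting set of points is what the abstract treats as a distribution to be classified in place of the raw series.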