2017 IEEE International Conference on Data Mining Workshops (ICDMW): Latest Papers (Page 4)

A Machine Learning Approach to Non-uniform Spatial Downscaling of Climate Variables

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.49

Soukayna Mouatadid, S. Easterbrook, A. Erler

Citations: 13

An CNN-LSTM Attention Approach to Understanding User Query Intent from Online Health Communities

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.62

Ruichu Cai, Binjun Zhu, Lei Ji, Tianyong Hao, Jun Yan, Wenyin Liu

Abstract: Understanding user query intent is a crucial task in the question-answering area. With the development of online health services, online health communities generate huge amounts of valuable medical question-answering data from which user intent can be mined. However, the queries posted by ordinary users contain many domain concepts and colloquial expressions, which makes understanding user intent very difficult. In this paper, we try to find and predict user intent from realistic medical text queries. A CNN-LSTM attention model is proposed to predict user intent, and an unsupervised clustering method is applied to mine a user intent taxonomy. The CNN-LSTM attention model has a CNN encoder and a Bi-LSTM attention encoder. The two encoders capture both global semantic expression and local phrase-level information from an original medical text query, which helps intent prediction. We also use extra knowledge such as part-of-speech tags and named-entity tags to enrich the feature information. Based on experiments on a health community query intent (HCQI) dataset, we compare our model with baseline models, and the experimental results demonstrate the effectiveness of our model.

Citations: 39

Improving Multivariate Time Series Forecasting with Random Walks with Restarts on Causality Graphs

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.127

Piotr Przymus, Youssef Hmamouche, Alain Casali, L. Lakhal

Citations: 11

Dependency Anomaly Detection for Heterogeneous Time Series: A Granger-Lasso Approach

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.155

Sahar Behzadi, K. Hlaváčková-Schindler, C. Plant

Abstract: The special characteristics of time series data, such as their high dimensionality and the complex dependencies between variables, make detecting anomalies in time series very challenging. Anomalies, and more precisely dependency anomalies, arise from temporal causal dependencies. Graphical Granger causal models provide an appropriate framework for capturing all the temporal dependencies in Gaussian time series. However, many production systems are characterized by highly complex stochastic processes consisting of heterogeneous time series. In this setting, discovering dependency anomalies is more challenging, since almost all current algorithms deal with the homogeneous case. The Granger-Lasso algorithm is a well-known L1-penalization algorithm that handles temporal causality detection only for Gaussian time series. Inspired by this algorithm, and considering the incremental heterogeneous time series generated in many different industries, we propose a modification of the Granger-Lasso algorithm that makes it applicable to a larger class of heterogeneous time series. The algorithm is motivated by generalized linear models. Moreover, based on the proposed algorithm for discovering temporal dependencies, we introduce its application to anomaly detection for time series that follow distributions from the exponential family, e.g. the Poisson, binomial, or multinomial distribution. The Granger-Lasso procedure is solved using a least-squares cost function with a Lasso penalty on appropriately transformed input time series. The experimental results illustrate the performance and efficiency of the proposed algorithm on synthetic and other datasets. We evaluated the proposed method on causality testing on different examples.

Citations: 3
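
The abstract of the Granger-Lasso entry above describes a least-squares cost with a Lasso penalty applied to lagged time series. The following is a minimal sketch of that baseline idea for the Gaussian case only; it is not the authors' heterogeneous extension. The function names (`soft_threshold`, `lasso_cd`, `granger_lasso`), the coordinate-descent solver, the penalty value, and the synthetic data are all assumptions made for illustration, and the series are assumed to be roughly zero-mean so no intercept is fitted.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b; b starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def granger_lasso(series, target, max_lag, lam):
    """Regress series[target] at time t on lags 1..max_lag of every series.

    Returns {(name, lag): coefficient}. A nonzero coefficient on a lag of
    another series suggests a temporal dependency of the target on it.
    """
    cols = [(name, lag) for name in sorted(series) for lag in range(1, max_lag + 1)]
    T = len(series[target])
    X = [[series[name][t - lag] for name, lag in cols] for t in range(max_lag, T)]
    y = [series[target][t] for t in range(max_lag, T)]
    return dict(zip(cols, lasso_cd(X, y, lam)))


# Synthetic Gaussian data (made up): y depends on x at lag 1, z is unrelated.
random.seed(0)
T = 300
x = [random.gauss(0, 1) for _ in range(T)]
z = [random.gauss(0, 1) for _ in range(T)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, T)]

coef = granger_lasso({"x": x, "y": y, "z": z}, target="y", max_lag=1, lam=0.2)
print(coef)
```

In this toy setup the penalty shrinks the coefficient on `("x", 1)` below the true 0.8 and should drive the unrelated lags to or near zero. Handling Poisson, binomial, or multinomial series as the abstract proposes would require the transformation step that this sketch leaves out.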

High Performance Graph Data Management and Mining with X10

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.135

Miyuru Dayarathna

Citations: 0

Meta-Morisita Index: Anomaly Behaviour Detection for Large Scale Tracking Data with Spatio-Temporal Marks

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.95

Zhao Yang, N. Japkowicz

Citations: 4

Steered Microaggregation: A Unified Primitive for Anonymization of Data Sets and Data Streams

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.141

J. Domingo-Ferrer, Jordi Soria-Comas

Citations: 13

A Feasible Direction Method for Optimization Problem with Orthogonal Constraint in Feature Selection

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.114

Jianyu Miao, Yong Shi, Lingfeng Niu

Abstract: Feature selection, as a fundamental component of building robust models, plays an important role in many machine learning and data mining tasks. Since acquiring labeled data is particularly expensive in both time and effort, unsupervised feature selection on unlabeled data has recently gained considerable attention. Without label information, unsupervised feature selection needs alternative criteria to define feature relevance. We propose a novel unsupervised feature selection model that embeds feature selection into nonnegative spectral clustering. A tailored optimization algorithm based on the Alternating Direction Method of Multipliers (ADMM) is designed to solve the proposed model. Many previous unsupervised feature selection methods used singular value decomposition (SVD) to handle the subproblem with the orthogonal constraint. Because the matrices in feature selection are generally very large, computing the SVD can be very slow or even infeasible. To address this issue, we propose using a feasible direction method to solve the subproblem with the orthogonal constraint efficiently. The experimental study shows that we obtain better performance than state-of-the-art methods.

Citations: 0

Personalized Anonymization for Set-Valued Data by Partial Suppression

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.142

Takuma Nakagawa, Hiromi Arai, Hiroshi Nakagawa

Abstract: Set-valued data consists of records that are sets of items, such as the goods purchased by each individual. Methods for publishing and widely using set-valued data while protecting personal information have been extensively studied in the field of privacy-preserving data publishing. Basic models such as k-anonymity or k^m-anonymity cannot cope with attribute inference by an adversary who has background knowledge of the records. The ρ-uncertainty model, on the other hand, makes it possible to prevent attribute inference with a confidence value above a certain level in set-valued data. Even then, however, the items to be protected have to be designated in advance. In this research, we propose a new model that provides more suitable privacy protection for each individual by protecting the different items designated for each record, and we build a heuristic algorithm that achieves this guarantee using partial suppression. In addition, because the computational complexity of the algorithm increases combinatorially with data size, we introduce the concept of probabilistic relaxation of the privacy guarantee. Finally, we present experimental results evaluating the performance of the algorithms on real-world datasets.

Citations: 7
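
The ρ-uncertainty idea mentioned in the entry above (no inference of a sensitive item with confidence above a set level) can be illustrated with a brute-force check. This is a minimal sketch under stated assumptions, not the paper's personalized model or its heuristic algorithm: the function names, the bound on antecedent size, the choice to include the empty antecedent (the item's base rate), the "at most ρ" reading of the threshold, and the toy records are all hypothetical.

```python
from itertools import combinations


def max_sensitive_confidence(transactions, sensitive, max_antecedent=2):
    """Return (confidence, antecedent, item) for the strongest rule
    antecedent -> sensitive item, over antecedents of bounded size."""
    items = sorted(set().union(*transactions))
    best = (0.0, frozenset(), None)
    for size in range(0, max_antecedent + 1):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            # Records that contain every item of the antecedent.
            covering = [t for t in transactions if antecedent <= t]
            if not covering:
                continue
            for s in sorted(sensitive):
                if s in antecedent:
                    continue
                conf = sum(1 for t in covering if s in t) / len(covering)
                if conf > best[0]:
                    best = (conf, antecedent, s)
    return best


def is_rho_uncertain(transactions, sensitive, rho, max_antecedent=2):
    """True if no checked rule infers a sensitive item with confidence above rho."""
    return max_sensitive_confidence(transactions, sensitive, max_antecedent)[0] <= rho


# Toy records (made up); "hiv_test" is the globally designated sensitive item.
records = [
    {"bread", "milk", "hiv_test"},
    {"bread", "milk"},
    {"bread", "beer"},
    {"milk", "hiv_test"},
]
sensitive_items = {"hiv_test"}

strongest = max_sensitive_confidence(records, sensitive_items)
print(strongest)
print(is_rho_uncertain(records, sensitive_items, rho=0.5))

# Partial suppression: drop one item from one record instead of the whole record.
suppressed = [set(r) for r in records]
suppressed[3].discard("hiv_test")
print(is_rho_uncertain(suppressed, sensitive_items, rho=0.5))
```

The exhaustive enumeration grows combinatorially with the number of items, which matches the complexity concern raised in the abstract. A personalized variant would take a per-record set of protected items rather than the single global `sensitive_items` set used here.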

Exploring Transfer Learning for Crime Prediction

2017 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2017-11-01 DOI: 10.1109/ICDMW.2017.165

Xiangyu Zhao, Jiliang Tang

Abstract: Crime prediction plays a crucial role in addressing crime, violence, conflict, and insecurity in cities, and in promoting good governance and appropriate urban planning and management. Many efforts have been made to develop crime prediction models that leverage demographic data, but they fail to capture the dynamic nature of crime in urban areas. Recently, with the development of new techniques for collecting and integrating fine-grained crime-related datasets, it has become possible to better understand the dynamics of crime and to advance crime prediction. However, it is hard to build a uniform framework for all boroughs of a city because the data is unevenly distributed. To this end, in this paper we exploit spatio-temporal patterns in the urban data of one borough of a city and then leverage transfer learning techniques to reinforce crime prediction for other boroughs. Specifically, we first validate the existence of spatio-temporal patterns in urban crime. Then we extract crime-related features from cross-domain datasets. Finally, we propose a novel transfer learning framework that integrates these features and models spatio-temporal patterns for crime prediction.

Citations: 29