2018 IEEE International Conference on Data Mining (ICDM): Latest Papers, Page 2

Matrix Profile XII: MPdist: A Novel Time Series Distance Measure to Allow Data Mining in More Challenging Scenarios

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00119

Shaghayegh Gharghabi, Shima Imani, A. Bagnall, Amirali Darvishzadeh, Eamonn J. Keogh

Citations: 33

Collective Human Behavior in Cascading System: Discovery, Modeling and Applications

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00045

Yunfei Lu, Linyun Yu, T. Zhang, Chengxi Zang, Peng Cui, Chaoming Song, Wenwu Zhu

Abstract: Collective behavior, which describes spontaneously emerging social processes and events, is ubiquitous in both physical society and online social media. Knowledge of collective behavior is critical to understanding and predicting social movements, fads, riots and so on. However, detecting, quantifying and modeling collective behavior in online social media at large scale has seldom been explored. In this paper, we examine a real-world online social media platform with more than 1.7 million information spreading records, which explicitly document the detailed human behavior in this online information cascading system. We observe evident collective behavior in information cascading, and then propose metrics to quantify the collectivity. We find that previous information cascading models cannot capture the collective behavior in the real world and thus never utilize it. Furthermore, we propose a generative framework with a latent user interest layer to capture the collective behavior in cascading systems. Our framework achieves high accuracy in modeling the information cascades with respect to popularity, structure and collectivity. By leveraging the knowledge of collective behavior, our model shows the capability of making predictions without temporal features or early-stage information. Our framework can serve as a more generalized one in modeling cascading systems and, together with empirical discovery and applications, advance our understanding of human behavior.

Citations: 9

Publisher's Information

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/icdm.2018.00206

Citations: 0

Interactive Unknowns Recommendation in E-Learning Systems

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00065

Shan-Yun Teng, Jundong Li, Lo Pang-Yun Ting, Kun-Ta Chuang, Huan Liu

Abstract: The rise of E-learning systems has led to an anytime-anywhere learning environment for everyone by providing various online courses and tests. However, due to the lack of teacher-student interaction, such ubiquitous learning is generally not as effective as offline classes. In traditional offline courses, teachers facilitate real-time interaction to teach students in accordance with personal aptitude, based on students' feedback in classes. Without the intervention of instructors, it is difficult for users to be aware of personal unknowns. In this paper, we address an important issue on the exploration of 'user unknowns' from an interactive question-answering process in E-learning systems. A novel interactive learning system, called CagMab, is devised to interactively recommend questions with a round-by-round strategy, which contributes to applications such as a conversational bot for self-evaluation. The flow enables users to discover their weaknesses and further helps them to progress. In fact, despite its importance, discovering personal unknowns remains a challenging problem in E-learning systems. Even though formulating the problem with the multi-armed bandit framework provides a solution, it often leads to suboptimal results for interactive unknowns recommendation, as it simply relies on the contextual features of answered questions. Note that each question is associated with concepts, and similar concepts are likely to be linked manually or systematically, which naturally forms concept graphs. Mining the rich relationships among users, questions and concepts could be potentially helpful in providing better unknowns recommendation. To this end, in this paper, we develop a novel interactive learning framework by borrowing strengths from concept-aware graph embedding for learning user unknowns. Our experimental studies on real data show that the proposed framework can effectively discover user unknowns in an interactive fashion for the recommendation in E-learning systems.

Citations: 9

A Unified Theory of the Mobile Sequential Recommendation Problem

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00189

Zeyang Ye, Keli Xiao, Yuefan Deng

Citations: 10

Intelligent Salary Benchmarking for Talent Recruitment: A Holistic Matrix Factorization Approach

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00049

Qingxin Meng, Hengshu Zhu, Keli Xiao, Hui Xiong

Abstract: As a vital process to the success of an organization, salary benchmarking aims at identifying the right market rate for each job position. Traditional approaches for salary benchmarking rely heavily on the experience of domain experts and limited market survey data, and so have difficulty handling dynamic scenarios with timely benchmarking requirements. To this end, in this paper, we propose a data-driven approach for intelligent salary benchmarking based on large-scale fine-grained online recruitment data. Specifically, we first construct a salary matrix based on the large-scale recruitment data and creatively formalize the salary benchmarking problem as a matrix completion task. Along this line, we develop a Holistic Salary Benchmarking Matrix Factorization (HSBMF) model for predicting the missing salary information in the salary matrix. Indeed, by integrating multiple confounding factors, such as company similarity, job similarity, and spatial-temporal similarity, HSBMF is able to provide a holistic and dynamic view for fine-grained salary benchmarking. Finally, extensive experiments on large-scale real-world data clearly validate the effectiveness of our approach for job salary benchmarking.
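
As an illustration of the matrix completion formulation described in the abstract above, the following is a minimal sketch and not the authors' HSBMF model. It fits a plain low-rank factorization to the observed cells of a small company-by-position salary matrix using stochastic gradient descent, then estimates a missing cell. The function names, the toy salary values, and all hyperparameters are assumptions made for illustration; the company, job, and spatial-temporal similarity terms that HSBMF integrates are omitted.

```python
import random


def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=2000, seed=0):
    """Fit R ~= P Q^T using only the observed entries (plain SGD).

    `observed` maps (row, col) -> value. All names and defaults here are
    hypothetical; this is generic matrix factorization, not HSBMF.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), value in observed.items():
            pred = sum(P[i][k] * Q[j][k] for k in range(rank))
            err = value - pred
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on squared error with L2 regularization.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, i, j):
    """Estimate the entry at row i, column j from the learned factors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


# Toy data (hypothetical): rows are companies, columns are job positions,
# values are salaries in arbitrary normalized units. Cell (2, 2) is missing.
observed = {
    (0, 0): 1.0, (0, 1): 1.5, (0, 2): 2.0,
    (1, 0): 1.2, (1, 1): 1.8, (1, 2): 2.4,
    (2, 0): 0.8, (2, 1): 1.2,
}
P, Q = factorize(observed, n_rows=3, n_cols=3)
estimate = predict(P, Q, 2, 2)
print(f"estimated salary for the missing cell: {estimate:.2f}")
```

The loss is evaluated only on observed cells, which is what makes this a completion problem rather than an ordinary factorization of a full matrix.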

Citations: 16

DE-RNN: Forecasting the Probability Density Function of Nonlinear Time Series

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00085

K. Yeo, Igor Melnyk, Nam H. Nguyen, Eun Kyung Lee

Abstract: Model-free identification of a nonlinear dynamical system from noisy observations is of current interest due to its direct relevance to many applications in Industry 4.0. Making a prediction of such noisy time series constitutes a problem of learning the nonlinear time evolution of a probability distribution. The capability of most conventional time series models is limited when the underlying dynamics are nonlinear or multi-scale, or when there is no prior knowledge at all on the system dynamics. We propose DE-RNN (Density Estimation Recurrent Neural Network) to learn the probability density function (PDF) of a stochastic process with underlying nonlinear dynamics and compute the time evolution of the PDF for a probabilistic forecast. A Recurrent Neural Network (RNN)-based model is employed to learn a nonlinear operator for the temporal evolution of the stochastic process. We use a softmax layer for a numerical discretization of a smooth PDF, which transforms a function approximation problem to a classification task. A regularized cross-entropy method is introduced to impose a smoothness condition on the estimated probability distribution. A Monte Carlo procedure to compute the temporal evolution of the distribution for a multiple-step forecast is presented. It is shown that the proposed algorithm can learn the nonlinear multi-scale dynamics from the noisy observations and provides an effective tool to forecast the time evolution of the underlying probability distribution. Evaluation of the algorithm on three synthetic and two real data sets shows an advantage over the compared baselines, and a potential value to a wide range of problems in physics and engineering.
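
The discretization idea mentioned in the abstract above, where a softmax output over bins stands in for a smooth PDF so that density estimation becomes classification, can be sketched as follows. This is a rough illustration under stated assumptions, not the DE-RNN implementation: the RNN and the Monte Carlo forecast are omitted, the logits are hard-coded, and every function name is hypothetical. The squared second-difference penalty is one common way to encourage smoothness and is an assumption here; the abstract does not specify the form of the regularizer.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bin_index(x, lo, hi, n_bins):
    """Index of the equal-width bin on [lo, hi] that contains x."""
    if not lo <= x <= hi:
        raise ValueError("x lies outside the discretized range")
    width = (hi - lo) / n_bins
    return min(int((x - lo) / width), n_bins - 1)


def regularized_cross_entropy(logits, target_bin, smooth_weight=0.1):
    """Cross-entropy for the observed bin plus a smoothness penalty.

    The penalty (sum of squared second differences of the bin
    probabilities) is an assumed form, chosen for illustration.
    """
    probs = softmax(logits)
    ce = -math.log(probs[target_bin])
    rough = sum(
        (probs[k - 1] - 2.0 * probs[k] + probs[k + 1]) ** 2
        for k in range(1, len(probs) - 1)
    )
    return ce + smooth_weight * rough


def density(probs, lo, hi):
    """Convert bin probabilities to a piecewise-constant PDF."""
    width = (hi - lo) / len(probs)
    return [p / width for p in probs]


logits = [0.0, 1.0, 2.0, 1.0]  # hypothetical network outputs for 4 bins
probs = softmax(logits)
target = bin_index(0.3, lo=-1.0, hi=1.0, n_bins=4)
loss = regularized_cross_entropy(logits, target)
pdf = density(probs, lo=-1.0, hi=1.0)
print(target, round(loss, 4), [round(d, 4) for d in pdf])
```

Dividing each bin probability by the bin width recovers a density that integrates to one over the discretized range, which is the link between the classification output and the PDF being approximated.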

Citations: 7

Pseudo-Implicit Feedback for Alleviating Data Sparsity in Top-K Recommendation

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00129

Yun He, Haochen Chen, Ziwei Zhu, James Caverlee

Citations: 6

Feature-Induced Partial Multi-label Learning

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00192

Guoxian Yu, Xia Chen, C. Domeniconi, J. Wang, Zhao Li, Z. Zhang, Xindong Wu

Abstract: Current efforts on multi-label learning generally assume that the given labels of training instances are noise-free. However, obtaining noise-free labels is quite difficult and often impractical, and the presence of noisy labels may compromise the performance of multi-label learning. Partial multi-label learning (PML) addresses the scenario in which each instance is annotated with a set of candidate labels, of which only a subset corresponds to the ground truth. The PML problem is more challenging than partial-label learning, since the latter assumes that only one label is valid and may ignore the correlation among candidate labels. To tackle the PML challenge, we introduce a feature-induced PML approach called fPML, which simultaneously estimates noisy labels and trains multi-label classifiers. In particular, fPML simultaneously factorizes the observed instance-label association matrix and the instance-feature matrix into low-rank matrices to achieve coherent low-rank matrices from the label and the feature spaces, as well as a low-rank label correlation matrix. The low-rank approximation of the instance-label association matrix is leveraged to estimate the association confidence. To predict the labels of unlabeled instances, fPML learns a matrix that maps the instances to labels based on the estimated association confidence. An empirical study on public multi-label datasets with injected noisy labels, and on archived proteomic datasets, shows that fPML can identify noisy labels more accurately than related solutions, and consequently can achieve better performance in predicting the labels of instances than competitive methods.

Citations: 59

An Integrated Model for Crime Prediction Using Temporal and Spatial Factors

2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00190

Fei Yi, Zhiwen Yu, Fuzhen Zhuang, X. Zhang, Hui Xiong

Citations: 42