2018 IEEE International Conference on Data Mining (ICDM): Latest Papers

Matrix Profile XII: MPdist: A Novel Time Series Distance Measure to Allow Data Mining in More Challenging Scenarios
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00119
Shaghayegh Gharghabi, Shima Imani, A. Bagnall, Amirali Darvishzadeh, Eamonn J. Keogh
Abstract: At their core, many time series data mining algorithms can be reduced to reasoning about the shapes of time series subsequences. This requires a distance measure, and most algorithms use Euclidean Distance or Dynamic Time Warping (DTW) as their core subroutine. We argue that these distance measures are not as robust as the community believes. The undue faith in these measures derives from an overreliance on benchmark datasets and self-selection bias. The community is reluctant to address more difficult domains, for which current distance measures are ill-suited. In this work, we introduce a novel distance measure, MPdist. We show that our proposed distance measure is much more robust than current distance measures. Furthermore, it allows us to successfully mine datasets that would defeat any Euclidean or DTW distance-based algorithm. Additionally, we show that our distance measure can be computed so efficiently that it allows analytics on fast streams.
Citations: 33
Collective Human Behavior in Cascading System: Discovery, Modeling and Applications
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00045
Yunfei Lu, Linyun Yu, T. Zhang, Chengxi Zang, Peng Cui, Chaoming Song, Wenwu Zhu
Abstract: Collective behavior, which describes spontaneously emerging social processes and events, is ubiquitous in both physical society and online social media. Knowledge of collective behavior is critical to understanding and predicting social movements, fads, riots and so on. However, detecting, quantifying and modeling collective behavior in online social media at large scale has seldom been explored. In this paper, we examine a real-world online social media platform with more than 1.7 million information spreading records, which explicitly document the detailed human behavior in this online information cascading system. We observe evident collective behavior in information cascading, and then propose metrics to quantify the collectivity. We find that previous information cascading models cannot capture the collective behavior in the real world and thus never utilize it. Furthermore, we propose a generative framework with a latent user interest layer to capture the collective behavior in cascading systems. Our framework achieves high accuracy in modeling information cascades with respect to popularity, structure and collectivity. By leveraging the knowledge of collective behavior, our model shows the capability of making predictions without temporal features or early-stage information. Our framework can serve as a more generalized one for modeling cascading systems and, together with empirical discovery and applications, advance our understanding of human behavior.
Citations: 9
Publisher's Information
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/icdm.2018.00206
Citations: 0
Interactive Unknowns Recommendation in E-Learning Systems
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00065
Shan-Yun Teng, Jundong Li, Lo Pang-Yun Ting, Kun-Ta Chuang, Huan Liu
Abstract: The rise of E-learning systems has led to an anytime-anywhere learning environment for everyone by providing various online courses and tests. However, due to the lack of teacher-student interaction, such ubiquitous learning is generally not as effective as offline classes. In traditional offline courses, teachers facilitate real-time interaction to teach students in accordance with personal aptitude, based on students' feedback in classes. Without the intervention of instructors, it is difficult for users to be aware of personal unknowns. In this paper, we address an important issue: the exploration of 'user unknowns' from an interactive question-answering process in E-learning systems. A novel interactive learning system, called CagMab, is devised to interactively recommend questions with a round-by-round strategy, which contributes to applications such as a conversational bot for self-evaluation. The flow enables users to discover their weaknesses and further helps them to progress. In fact, despite its importance, discovering personal unknowns remains a challenging problem in E-learning systems. Even though formulating the problem with the multi-armed bandit framework provides a solution, it often leads to suboptimal results for interactive unknowns recommendation, as it simply relies on the contextual features of answered questions. Note that each question is associated with concepts, and similar concepts are likely to be linked manually or systematically, which naturally forms concept graphs. Mining the rich relationships among users, questions and concepts could be helpful in providing better unknowns recommendation. To this end, in this paper, we develop a novel interactive learning framework by borrowing strengths from concept-aware graph embedding for learning user unknowns. Our experimental studies on real data show that the proposed framework can effectively discover user unknowns in an interactive fashion for recommendation in E-learning systems.
Citations: 9
A Unified Theory of the Mobile Sequential Recommendation Problem
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00189
Zeyang Ye, Keli Xiao, Yuefan Deng
Abstract: A theory is developed to unify the original form, and its many variations, of the mobile sequential recommendation (MSR) problem. The unified theory, expressing the same MSR problem, is superior to the original form in many aspects, including a more standardized form. In addition to a newly proposed expected traveling time (ETT) function to measure the quality of recommended routes, we introduce five additional improvements. Also, three essential mathematical properties of the new objective function enable the development of methods to solve realistic MSR problems with complex conditions. The MSR solutions also support the discovered properties of the proposed objective function. The unified theory should support long-term decision making for drivers and the traffic department in general.
Citations: 10
Intelligent Salary Benchmarking for Talent Recruitment: A Holistic Matrix Factorization Approach
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00049
Qingxin Meng, Hengshu Zhu, Keli Xiao, Hui Xiong
Abstract: As a vital process for the success of an organization, salary benchmarking aims at identifying the right market rate for each job position. Traditional approaches to salary benchmarking rely heavily on the experience of domain experts and on limited market survey data, and have difficulty handling dynamic scenarios that require timely benchmarking. To this end, in this paper, we propose a data-driven approach for intelligent salary benchmarking based on large-scale fine-grained online recruitment data. Specifically, we first construct a salary matrix based on the large-scale recruitment data and creatively formalize the salary benchmarking problem as a matrix completion task. Along this line, we develop a Holistic Salary Benchmarking Matrix Factorization (HSBMF) model for predicting the missing salary information in the salary matrix. Indeed, by integrating multiple confounding factors, such as company similarity, job similarity, and spatial-temporal similarity, HSBMF is able to provide a holistic and dynamic view for fine-grained salary benchmarking. Finally, extensive experiments on large-scale real-world data clearly validate the effectiveness of our approach for job salary benchmarking.
Citations: 16
DE-RNN: Forecasting the Probability Density Function of Nonlinear Time Series
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00085
K. Yeo, Igor Melnyk, Nam H. Nguyen, Eun Kyung Lee
Abstract: Model-free identification of a nonlinear dynamical system from noisy observations is of current interest due to its direct relevance to many applications in Industry 4.0. Making a prediction of such noisy time series constitutes a problem of learning the nonlinear time evolution of a probability distribution. The capability of most conventional time series models is limited when the underlying dynamics are nonlinear or multi-scale, or when there is no prior knowledge at all of the system dynamics. We propose DE-RNN (Density Estimation Recurrent Neural Network) to learn the probability density function (PDF) of a stochastic process with underlying nonlinear dynamics and to compute the time evolution of the PDF for a probabilistic forecast. A Recurrent Neural Network (RNN)-based model is employed to learn a nonlinear operator for the temporal evolution of the stochastic process. We use a softmax layer for a numerical discretization of a smooth PDF, which transforms a function approximation problem into a classification task. A regularized cross-entropy method is introduced to impose a smoothness condition on the estimated probability distribution. A Monte Carlo procedure to compute the temporal evolution of the distribution for a multiple-step forecast is presented. It is shown that the proposed algorithm can learn the nonlinear multi-scale dynamics from noisy observations and provides an effective tool to forecast the time evolution of the underlying probability distribution. Evaluation of the algorithm on three synthetic and two real data sets shows an advantage over the compared baselines, and potential value for a wide range of problems in physics and engineering.
Citations: 7
Pseudo-Implicit Feedback for Alleviating Data Sparsity in Top-K Recommendation
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00129
Yun He, Haochen Chen, Ziwei Zhu, James Caverlee
Abstract: We propose PsiRec, a novel user preference propagation recommender that incorporates pseudo-implicit feedback for enriching the original sparse implicit feedback dataset. Three of the unique characteristics of PsiRec are: (i) it views user-item interactions as a bipartite graph and models pseudo-implicit feedback from this perspective; (ii) its random walks-based approach extracts graph structure information from this bipartite graph, toward estimating pseudo-implicit feedback; and (iii) it adopts a Skip-gram inspired measure of confidence in pseudo-implicit feedback that captures the pointwise mutual information between users and items. This pseudo-implicit feedback is ultimately incorporated into a new latent factor model to estimate user preference in cases of extreme sparsity. PsiRec results in improvements of 21.5% and 22.7% in terms of Precision@10 and Recall@10 over state-of-the-art Collaborative Denoising Auto-Encoders. Our implementation is available at https://github.com/heyunh2015/PsiRecICDM2018.
Citations: 6
Feature-Induced Partial Multi-label Learning
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00192
Guoxian Yu, Xia Chen, C. Domeniconi, J. Wang, Zhao Li, Z. Zhang, Xindong Wu
Abstract: Current efforts on multi-label learning generally assume that the given labels of training instances are noise-free. However, obtaining noise-free labels is quite difficult and often impractical, and the presence of noisy labels may compromise the performance of multi-label learning. Partial multi-label learning (PML) addresses the scenario in which each instance is annotated with a set of candidate labels, of which only a subset corresponds to the ground truth. The PML problem is more challenging than partial-label learning, since the latter assumes that only one label is valid and may ignore the correlation among candidate labels. To tackle the PML challenge, we introduce a feature-induced PML approach called fPML, which simultaneously estimates noisy labels and trains multi-label classifiers. In particular, fPML simultaneously factorizes the observed instance-label association matrix and the instance-feature matrix into low-rank matrices to achieve coherent low-rank matrices from the label and the feature spaces, and a low-rank label correlation matrix as well. The low-rank approximation of the instance-label association matrix is leveraged to estimate the association confidence. To predict the labels of unlabeled instances, fPML learns a matrix that maps the instances to labels based on the estimated association confidence. An empirical study on public multi-label datasets with injected noisy labels, and on archived proteomic datasets, shows that fPML can more accurately identify noisy labels than related solutions, and consequently can achieve better performance on predicting labels of instances than competitive methods.
Citations: 59
An Integrated Model for Crime Prediction Using Temporal and Spatial Factors
2018 IEEE International Conference on Data Mining (ICDM) Pub Date : 2018-11-01 DOI: 10.1109/ICDM.2018.00190
Fei Yi, Zhiwen Yu, Fuzhen Zhuang, X. Zhang, Hui Xiong
Abstract: Given its importance, crime prediction has attracted a lot of attention in the literature, and several methods have been proposed to discover different aspects of characteristics for crime prediction. In this paper, we propose a Clustered Continuous Conditional Random Field (Clustered-CCRF) model which is able to effectively exploit both spatial and temporal factors for crime prediction in an integrated way. In particular, we observe that the number of crimes in a specific area is not only conditioned on its own historical records but is also highly correlated with crime records from similar areas. Therefore, we propose two factors, an auto-regressed temporal correlation and a feature-based inter-area spatial correlation, to measure such patterns for crime prediction. Further, we present a tree-structured clustering algorithm to discover highly similar areas based on spatial characteristics to improve the performance of our proposed model. Experiments on a real-world crime dataset demonstrate the superiority of our proposed model over state-of-the-art methods.
Citations: 42