2017 IEEE International Conference on Data Mining (ICDM): Latest Papers

Relational Mixture of Experts: Explainable Demographics Prediction with Behavioral Data
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.45
M. Oyamada, S. Nakadai
Abstract: Given a collection of basic customer demographics (e.g., age and gender) and their behavioral data (e.g., item purchase histories), how can we predict sensitive demographics (e.g., income and occupation) that not every customer makes available? This demographics prediction problem is modeled as a classification task in which a customer's sensitive demographic y is predicted from his feature vector x. So far, two lines of work have tried to produce a "good" feature vector x from the customer's behavioral data: (1) application-specific feature engineering using behavioral data and (2) representation learning (such as singular value decomposition or neural embedding) on behavioral data. Although these approaches successfully improve the predictive performance, (1) designing a good feature requires domain experts to make a great effort and (2) features obtained from representation learning are hard to interpret. To overcome these problems, we present a Relational Infinite Support Vector Machine (R-iSVM), a mixture-of-experts model that can leverage behavioral data. Instead of augmenting the feature vectors of customers, R-iSVM uses behavioral data to find behaviorally similar customer clusters and constructs a local prediction model at each customer cluster. In doing so, R-iSVM successfully improves the predictive performance without requiring application-specific feature design or hard-to-interpret representations. Experimental results on three real-world datasets demonstrate the predictive performance and interpretability of R-iSVM. Furthermore, R-iSVM can co-exist with previous demographics prediction methods to further improve their predictive performance.
Citations: 3
Fast Compressive Spectral Clustering
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.120
Tingshu Li, Yiming Zhang, Dongsheng Li, Xinwang Liu, Yuxing Peng
Abstract: Compressive spectral clustering (CSC) efficiently leverages graph filter and random sampling techniques to speed up the clustering process. However, we find that the CSC algorithm suffers from two main problems: i) The direct use of the dichotomy and eigencount techniques for estimating the Laplacian matrix's k-th eigenvalue is expensive. ii) The computation of the polynomial approximation repeats in each iteration for every cluster in the interpolation process, which occupies most of the computation time of CSC. To address these problems, we propose a new approach called FCSC for fast compressive spectral clustering. FCSC addresses the first problem by assuming that the eigenvalues approximately satisfy a local uniform distribution, and addresses the second problem by recalculating the pairwise similarity between nodes with a low-dimensional representation to reconstruct a denoised Laplacian matrix. The time complexity of reconstruction is linear in the number of non-zeros in the Laplacian matrix. As experimentally demonstrated on artificial and real-world datasets, our approach significantly reduces the computation time while preserving high clustering accuracy comparable to previous designs, verifying the effectiveness of FCSC.
Citations: 3
LCD: A Fast Contrastive Divergence Based Algorithm for Restricted Boltzmann Machine
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.131
Lin Ning, Randall Pittman, Xipeng Shen
Abstract: Restricted Boltzmann Machine (RBM) is the building block of Deep Belief Nets and other deep learning tools. Fast learning and prediction are both essential for practical usage of RBM-based machine learning techniques. This paper proposes Lean Contrastive Divergence (LCD), a modified Contrastive Divergence (CD) algorithm, to accelerate RBM learning and prediction without changing the results. LCD avoids most of the required computations with two optimization techniques. The first is called bounds-based filtering, which, through the triangle inequality, replaces expensive calculations of many vector dot products with fast bounds calculations. The second is delta product, which effectively detects and avoids many repeated calculations in the core operation of RBM, Gibbs Sampling. The optimizations are applicable to both the standard contrastive divergence learning algorithm and its variations. Results show that the optimizations can produce several-fold (up to 3X for training and 5.3X for prediction) speedups.
Citations: 15
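The bounds-based filtering idea in the LCD abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract says their bounds come from the triangle inequality, whereas the sketch assumes the simpler Cauchy-Schwarz bound |w·v| ≤ ‖w‖‖v‖, and the function names are hypothetical. Sampling a hidden unit amounts to testing sigmoid(w·v + b) > u for a uniform draw u, which is equivalent to w·v + b > logit(u); when the bound already settles that comparison, the dot product is skipped without changing the result.

```python
import math


def logit(u):
    """Inverse of the sigmoid; u must lie strictly between 0 and 1."""
    return math.log(u / (1.0 - u))


def sample_hidden_unit(w, v, b, u, w_norm, v_norm):
    """Sample one binary hidden unit; returns (state, used_dot_product).

    w_norm and v_norm are the precomputed Euclidean norms of w and v
    (an assumption of this sketch: they are cached and reused across units).
    """
    threshold = logit(u)
    bound = w_norm * v_norm  # Cauchy-Schwarz: |w . v| <= bound

    if b - bound > threshold:
        # Even the smallest possible activation exceeds the threshold.
        return 1, False
    if b + bound <= threshold:
        # Even the largest possible activation stays at or below it.
        return 0, False

    # The bound is inconclusive, so fall back to the exact dot product.
    activation = sum(wi * vi for wi, vi in zip(w, v)) + b
    return (1 if activation > threshold else 0), True
```

The returned state is always the one the exact computation would give, which mirrors the abstract's claim of acceleration "without changing the results"; how often the shortcut fires depends on how tight the bound is for the data at hand.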
Statistical Link Label Modeling for Sign Prediction: Smoothing Sparsity by Joining Local and Global Information
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.135
Amin Javari, Hongxiang Qiu, Elham Barzegaran, M. Jalili, K. Chang
Abstract: One of the major issues in signed networks is to use network structure to predict the missing sign of an edge. In this paper, we introduce a novel probabilistic approach for the sign prediction problem. The main characteristic of the proposed models is their ability to adapt to the sparsity level of an input network. Building a model that can adapt to the sparsity of the data has not yet been considered in previous related work. We suggest that there exists a dilemma between local and global structures and attempt to build sparsity-adaptive models by resolving this dilemma. To this end, we propose probabilistic prediction models based on local and global structures and integrate them based on the concept of smoothing. The model relies more on the global structures when the sparsity increases, whereas it gives more weight to the information obtained from local structures for low levels of sparsity. The proposed model is assessed on three real-world signed networks, and the experiments reveal its consistent superiority over the state-of-the-art methods. Compared to the previous methods, the proposed model not only better handles the sparsity problem, but also has lower computational complexity and can be updated using real-time data streams.
Citations: 18
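The smoothing idea in the sign prediction abstract (lean on local structure when enough of it is observed, fall back on global structure when the network is sparse) can be sketched as a simple interpolation. The weight n / (n + mu), the pseudo-count `mu`, and the function name are assumptions borrowed from standard additive smoothing; the abstract does not specify the paper's actual model.

```python
def smoothed_positive_probability(local_pos, local_neg, global_pos_rate, mu=5.0):
    """Estimate P(edge sign is positive) by blending local and global evidence.

    local_pos / local_neg: counts of positive / negative signs observed in the
        edge's local neighbourhood (hypothetical local evidence).
    global_pos_rate: fraction of positive edges in the whole network.
    mu: pseudo-count controlling how much local evidence is needed before
        it outweighs the global rate (an assumed tuning parameter).
    """
    n = local_pos + local_neg
    if n == 0:
        # No local evidence at all: rely entirely on the global structure.
        return global_pos_rate

    local_rate = local_pos / n
    lam = n / (n + mu)  # approaches 1 as local evidence accumulates
    return lam * local_rate + (1.0 - lam) * global_pos_rate
```

With `mu=10`, an edge with 8 positive and 2 negative local observations in a network that is half positive gets 0.5 · 0.8 + 0.5 · 0.5 = 0.65; with no local observations the estimate is the global rate itself.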
iNEAT: Incomplete Network Alignment
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.160
Si Zhang, Hanghang Tong, Jie Tang, Jiejun Xu, Wei Fan
Abstract: Network alignment and network completion are two fundamental cornerstones behind many high-impact graph mining applications. State-of-the-art methods have been addressing these tasks in parallel. In this paper, we argue that network alignment and completion are inherently complementary to each other, and hence propose to jointly address them so that the two tasks can benefit from each other. We formulate it from the optimization perspective, and propose an effective algorithm, iNEAT, to solve it. The proposed method offers two distinctive advantages. First (alignment accuracy), our method benefits from higher-quality input networks while mitigating the effect of incorrectly inferred links introduced by the completion task itself. Second (alignment efficiency), thanks to the low-rank structure of the complete networks and alignment matrix, the alignment can be significantly accelerated. Extensive experiments demonstrate the performance of our algorithm.
Citations: 27
Aspect Sentiment Model for Micro Reviews
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.83
Reinald Kim Amplayo, Seung-won Hwang
Abstract: This paper aims at an aspect sentiment model for aspect-based sentiment analysis (ABSA) focused on micro reviews. This task is important in order to understand the short reviews that the majority of users write, while existing topic models are targeted at expert-level long reviews with sufficient co-occurrence patterns to observe. Current methods for aggregating micro reviews using metadata information may not be effective either, due to metadata absence, topical heterogeneity, and cold start problems. To this end, we propose a model called Micro Aspect Sentiment Model (MicroASM). MicroASM is based on the observation that short reviews 1) are viewed with sentiment-aspect word pairs as building blocks of information, and 2) can be clustered into larger reviews. When compared to the current state-of-the-art aspect sentiment models, experiments show that our model provides better performance on aspect-level tasks such as aspect term extraction and document-level tasks such as sentiment classification.
Citations: 14
Informing the Use of Hyperparameter Optimization Through Metalearning
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.137
Samantha Sanders, C. Giraud-Carrier
Abstract: One of the challenges of data mining is finding hyperparameters for a learning algorithm that will produce the best model for a given dataset. Hyperparameter optimization automates this process, but it can still take significant time. It has been found that hyperparameter optimization does not always result in induced models with significant improvement over default values, yet no systematic analysis of the role of hyperparameter optimization in machine learning has been conducted. We use metalearning to inform the decision of whether to optimize hyperparameters based on expected performance improvement and computational cost.
Citations: 34
BiCycle: Item Recommendation with Life Cycles
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.39
Xinyue Liu, Y. Song, C. Aggarwal, Yao Zhang, Xiangnan Kong
Abstract: Recommender systems, which can help users explore new items in many applications, have attracted much attention in the last decades. As a popular technique in recommender systems, item recommendation works by recommending items to users based on their historical interactions. Conventional item recommendation methods usually assume that users and items are stationary, which is not always the case in real-world applications. Many time-aware item recommendation models have been proposed to take temporal effects into consideration based on the absolute time stamps associated with observed interactions. We show that using absolute time to model temporal effects can be limited in some circumstances. In this work, we propose to model the temporal dynamics of both users and items in item recommendation based on their life cycles. This problem is very challenging to solve since the users and items can co-evolve in their life cycles and the sparseness of the data becomes more severe when we consider the life cycles of both users and items. A novel time-aware item recommendation model called BiCycle is proposed to address these challenges. BiCycle is designed based on two important observations: 1) correlated users or items usually share similar patterns in similar stages of their life cycles; 2) user preferences and item characters can evolve gradually over different stages of their life cycles. Extensive experiments conducted on three real-world datasets demonstrate that the proposed approach can significantly improve the performance of recommendation tasks by considering the inner life cycles of both users and items.
Citations: 6
MDL for Causal Inference on Discrete Data
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.87
Kailash Budhathoki, Jilles Vreeken
Abstract: The algorithmic Markov condition states that the most likely causal direction between two random variables X and Y can be identified as the direction with the lowest Kolmogorov complexity. This notion is very powerful as it can detect any causal dependency that can be explained by a physical process. However, due to the halting problem, it is also not computable. In this paper we propose a computable instantiation that provably maintains the key aspects of the ideal. We propose to approximate Kolmogorov complexity via the Minimum Description Length (MDL) principle, using a score that is mini-max optimal with regard to the model class under consideration. This means that even in an adversarial setting, the score degrades gracefully, and we are still maximally able to detect dependencies between the marginal and the conditional distribution. As a proof of concept, we propose CISC, a linear-time algorithm for causal inference by stochastic complexity, for pairs of univariate discrete variables. Experiments show that CISC is highly accurate on synthetic, benchmark, as well as real-world data, outperforming the state of the art by a margin, and scales extremely well with regard to sample and domain sizes.
Citations: 32
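The scoring idea in the MDL abstract (prefer the causal direction with the shorter description) can be sketched for a pair of discrete variables. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the multinomial normalized maximum likelihood (NML) code as the stochastic complexity, uses the known recurrence for the NML normalizer, and takes the conditional complexity to be the sum of per-group complexities. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def multinomial_normalizer(n, k):
    """NML normalizer C(n, k) for n samples over k categories.

    Suitable for small n only: the direct sum mixes large integers and
    floats, so it is not meant for very long sequences.
    """
    if n == 0 or k == 1:
        return 1.0
    c_prev = 1.0  # C(n, 1)
    # C(n, 2) by direct summation; 0 ** 0 evaluates to 1 in Python.
    c_curr = sum(
        math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    # Recurrence: C(n, j + 2) = C(n, j + 1) + (n / j) * C(n, j).
    for j in range(1, k - 1):
        c_prev, c_curr = c_curr, c_curr + (n / j) * c_prev
    return c_curr


def stochastic_complexity(values, domain_size):
    """Code length in bits of a sequence under the multinomial NML code."""
    n = len(values)
    counts = Counter(values)
    neg_log_ml = -sum(c * math.log2(c / n) for c in counts.values())
    return neg_log_ml + math.log2(multinomial_normalizer(n, domain_size))


def conditional_complexity(targets, conditions, domain_size):
    """Sum of the complexities of the targets grouped by condition value."""
    groups = defaultdict(list)
    for t, c in zip(targets, conditions):
        groups[c].append(t)
    return sum(stochastic_complexity(g, domain_size) for g in groups.values())


def direction_scores(xs, ys):
    """Return (bits for X -> Y, bits for Y -> X); lower suggests the direction."""
    kx, ky = len(set(xs)), len(set(ys))
    x_to_y = stochastic_complexity(xs, kx) + conditional_complexity(ys, xs, ky)
    y_to_x = stochastic_complexity(ys, ky) + conditional_complexity(xs, ys, kx)
    return x_to_y, y_to_x
```

Under these assumptions, `direction_scores(xs, ys)` returns the two total code lengths; the smaller one indicates the direction this score favours, and equal totals leave the pair undecided.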
A Probabilistic Approach for Learning with Label Proportions Applied to the US Presidential Election
2017 IEEE International Conference on Data Mining (ICDM) Pub Date: 2017-11-01 DOI: 10.1109/ICDM.2017.54
Tao Sun, D. Sheldon, Brendan O'Connor
Abstract: Ecological inference (EI) is a classical problem from political science to model voting behavior of individuals given only aggregate election results. Flaxman et al. recently formulated EI as a machine learning problem using distribution regression, and applied it to analyze US presidential elections. However, distribution regression unnecessarily aggregates individual-level covariates available from census microdata, and ignores known structure of the aggregation mechanism. We instead formulate the problem as learning with label proportions (LLP), and develop a new, probabilistic, LLP method to solve it. Our model is the straightforward one where individual votes are latent variables. We use cardinality potentials to efficiently perform exact inference over latent variables during learning, and introduce a novel message-passing algorithm to extend cardinality potentials to multivariate probability models for use within multiclass LLP problems. We show experimentally that LLP outperforms distribution regression for predicting individual-level attributes, and that our method is as good as or better than existing state-of-the-art LLP methods.
Citations: 23
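The label proportions abstract treats individual votes as latent variables that are only observed through aggregate counts. One ingredient of exact inference in such a model is the distribution of the number of positive votes among independent voters, which the sketch below computes with a standard dynamic program. It assumes independent binary votes and is only an illustration; it is not the paper's message-passing algorithm, and the function name is hypothetical.

```python
def count_distribution(probs):
    """Distribution of the number of positive votes among independent voters.

    probs[i] is the assumed probability that voter i casts a positive vote.
    Returns a list dist where dist[k] = P(exactly k positive votes).
    """
    dist = [1.0]  # with no voters, the count is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this voter votes negative
            nxt[k + 1] += mass * p      # this voter votes positive
        dist = nxt
    return dist
```

Given a region's observed aggregate count k, `count_distribution(probs)[k]` is the likelihood of that count under the individual-level probabilities, which is the quantity a learner would compare against the reported election result. This simple version takes time quadratic in the number of voters.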