2013 IEEE 13th International Conference on Data Mining: Latest Papers

Communication-Efficient Distributed Multiple Reference Pattern Matching for M2M Systems
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.161
Jui-Pin Wang, Yu-Chen Lu, Mi-Yen Yeh, Shou-de Lin, Phillip B. Gibbons
Abstract: In M2M applications, it is very common to encounter the ad hoc snapshot query that requires fast responses from many local machines in which all the data are distributed. In the scenario when the query is more complex, the communication cost for sending it to all the local machines for processing can be very high. This paper aims to address this issue. Given a reference set of multiple and large-size patterns, we propose an approach to identifying its k nearest and farthest neighbors globally across all the local machines. By decomposing the reference patterns into a multi-resolution representation and using novel distance bound designs, our method guarantees the exact results in a communication-efficient manner. Analytical and empirical studies show that our method outperforms the state-of-the-art methods in saving significant bandwidth usage, especially for large numbers of machines and large-sized reference patterns.
Citations: 4
A Parameter-Free Spatio-Temporal Pattern Mining Model to Catalog Global Ocean Dynamics
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.162
James H. Faghmous, M. Le, Muhammed Uluyol, Vipin Kumar, Snigdhansu Chatterjee
Abstract: As spatio-temporal data have become ubiquitous, an increasing challenge facing computer scientists is that of identifying discrete patterns in continuous spatio-temporal fields. In this paper, we introduce a parameter-free pattern mining application that is able to identify dynamic anomalies in ocean data, known as ocean eddies. Despite ocean eddy monitoring being an active field of research, we provide one of the first quantitative analyses of the performance of the most used monitoring algorithms. We present an incomplete information validation technique, that uses the performance of two methods to construct an imperfect ground truth to test the significance of patterns discovered as well as the relative performance of pattern mining algorithms. These methods, in addition to the validation schemes discussed provide researchers new directions in analyzing large unlabeled climate datasets.
Citations: 29
An Efficient Approach to Updating Closeness Centrality and Average Path Length in Dynamic Networks
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.135
Chia-Chen Yen, Mi-Yen Yeh, Ming-Syan Chen
Abstract: Closeness centrality measures the communication efficiency of a specific vertex within a network while the average path length (APL) measures that of the whole network. Since the nature of these two measurements is based on the computation of all-pair shortest path distances, one can perform the breadth-first search method starting at every vertex and obtain the two measurements. However, as the edge counts in the real-world networks like Facebook increase over time, this naive way is obviously inefficient. In this paper, we proposed CENDY, an efficient approach to updating Closeness centrality and average path length in Dynamic networks when there is an edge insertion or deletion. In CENDY, we derived some theoretical properties to quickly identify a set of vertices whose shortest path changed after an edge update, and then update the closeness centralities of those vertices only as well as the APL of the graph by a few of single-source shortest path computations. We conducted extensive experiments to show that, when compared to the existing methods of computing exact or approximate values, CENDY outperformed others in significantly low update time while providing exact values of the two measurements on various real-world graph datasets.
Citations: 27
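
The naive baseline mentioned in the abstract above (a breadth-first search from every vertex) can be sketched as follows. This is only an illustration, not CENDY itself. It assumes a connected, undirected, unweighted graph stored as an adjacency dictionary, and it uses one common definition of closeness, (n - 1) divided by the sum of shortest-path distances. The function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_and_apl(adj):
    """Naive all-pairs computation: one BFS per vertex.

    Assumes the graph is connected; closeness is (n - 1) / sum of distances
    (an assumed, commonly used definition), and APL is the mean distance
    over all ordered pairs of distinct vertices.
    """
    n = len(adj)
    closeness = {}
    total = 0
    for s in adj:
        far = sum(bfs_distances(adj, s).values())
        closeness[s] = (n - 1) / far if far > 0 else 0.0
        total += far
    apl = total / (n * (n - 1)) if n > 1 else 0.0
    return closeness, apl
```

Each call costs one BFS per vertex, which is why recomputing from scratch after every edge update becomes expensive on large graphs.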
Beyond Boolean Matrix Decompositions: Toward Factor Analysis and Dimensionality Reduction of Ordinal Data
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.127
R. Belohlávek, Markéta Krmelová
Abstract: Boolean matrix factorization (BMF), or decomposition, received a considerable attention in data mining research. In this paper, we argue that research should extend beyond the Boolean case toward more general type of data such as ordinal data. Technically, such extension amounts to replacement of the two-element Boolean algebra utilized in BMF by more general structures, which brings non-trivial challenges. We first present the problem formulation, survey the existing literature, and provide an illustrative example. Second, we present new theorems regarding decompositions of matrices with ordinal data. Third, we propose a new algorithm based on these results along with an experimental evaluation.
Citations: 13
Regularization Paths for Sparse Nonnegative Least Squares Problems with Applications to Life Cycle Assessment Tree Discovery
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.125
Jingu Kim, Naren Ramakrishnan, M. Marwah, Amip Shah, Haesun Park
Abstract: The nonnegative least squares problems are useful in applications where the physical nature of problem domain permits only additive linear combinations. We discuss the l1-regularized nonnegative least squares (L1-NLS) problem, where l1-regularization is used to induce sparsity. Although l1-regularization has been successfully used in least squares regression, when combined with nonnegativity constraints, developments of algorithms and their understandings have been limited. We propose an algorithm that generates the entire regularization paths of the L1-NLS problem. We prove the correctness of the proposed algorithm and illustrate a novel application in environmental sustainability. The application relates to life cycle assessment (LCA), a technique used to estimate environmental impact during the entire lifetime of a product. We address an inverse problem in LCA. Given environmental impact factors of a target product and of a large library of constituents, the goal is to reverse engineer an inventory tree for the product. Using real-world data sets, we demonstrate how our L1-NLS approach controls the size of discovered trees, and how the full regularization paths effectively illustrate the spectrum of discovered trees with varying sparsity and compositions.
Citations: 3
Structural-Context Similarities for Uncertain Graphs
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.22
Zhaonian Zou, Jianzhong Li
Abstract: Structural-context similarities between vertices in graphs, such as the Jaccard similarity, the Dice similarity, and the cosine similarity, play important roles in a number of graph data analysis techniques. However, uncertainty is inherent in massive graph data, and therefore the classical definitions of structural-context similarities on exact graphs don't make sense on uncertain graphs. In this paper, we propose a generic definition of structural-context similarity for uncertain graphs. Since it is computationally prohibitive to compute the similarity between two vertices of an uncertain graph directly by its definition, we investigate two efficient approaches to computing similarities, namely the polynomial-time exact algorithms and the linear-time approximation algorithms. The experimental results on real uncertain graphs verify the effectiveness of the proposed structural-context similarities as well as the accuracy and efficiency of the proposed evaluation algorithms.
Citations: 12
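
For orientation, the classical similarities named in the abstract above can be sketched on an exact (deterministic) graph by comparing the neighbor sets of two vertices. This is not the paper's definition for uncertain graphs, which the abstract does not spell out. The adjacency-dictionary layout and the function name are assumptions made for illustration.

```python
import math


def neighbor_similarities(adj, u, v):
    """Jaccard, Dice, and cosine similarity of the neighbor sets of u and v.

    adj maps each vertex to an iterable of its neighbors (exact graph only).
    Returns 0.0 for a measure whose denominator would be zero.
    """
    nu, nv = set(adj[u]), set(adj[v])
    common = len(nu & nv)
    union = len(nu | nv)
    jaccard = common / union if union else 0.0
    dice = 2 * common / (len(nu) + len(nv)) if (nu or nv) else 0.0
    cosine = common / math.sqrt(len(nu) * len(nv)) if (nu and nv) else 0.0
    return {"jaccard": jaccard, "dice": dice, "cosine": cosine}
```

On an uncertain graph each edge exists only with some probability, so the neighbor sets themselves become random, which is why these set-based formulas cannot be applied directly.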
Search Behavior Based Latent Semantic User Segmentation for Advertising Targeting
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.62
Xueqing Gong, Xinyu Guo, Rong Zhang, Xiaofeng He, Aoying Zhou
Abstract: The popularity of internet usage greatly motivates the online advertising activities. Compared to advertising on traditional media, online advertising has rich information as well as necessary techniques to achieve precise user targeting. This rich information includes the search behaviors of a user, such as queries issued, or the ads clicked by the user. For popular websites with large number of active users, ad delivery targeting at individual users puts too much burden on the system. User segmentation is an alternative way to relieve this burden by grouping users of similar interests together, then the ad delivery system targets the user segments to display relevant ads, instead of individual users. Existing user segmentation work either adapts clustering methods without considering the hidden semantics embedded in the data, such as K-means, or treats users as data instance and clusters users indirectly even if the latent semantics is incorporated into the transformed data, such as PLSA or LDA. In this paper, we present a search behavior based latent semantic user segmentation method and validate its effectiveness on new ads. Instead of treating users as data instances, they are used as attributes of user issued queries or clicked ads which are considered to be data instances. LDA is then applied to this data set to directly obtain the user segments. Compared to popular K-means clustering, our approach achieves higher CTR values on new ads, with only simple search information.
Citations: 12
On Anomalous Hotspot Discovery in Graph Streams
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.32
Weiren Yu, C. Aggarwal, Shuai Ma, Haixun Wang
Abstract: Network streams have become ubiquitous in recent years because of many dynamic applications. Such streams may show localized regions of activity and evolution because of anomalous events. This paper will present methods for dynamically determining anomalous hot spots from network streams. These are localized regions of sudden activity or change in the underlying network. We will design a localized principal component analysis algorithm, which can continuously maintain the information about the changes in the different neighborhoods of the network. We will use a fast incremental eigenvector update algorithm based on von Mises iterations in a lazy way in order to efficiently maintain local correlation information. This is used to discover local change hotspots in dynamic streams. We will finally present an experimental study to demonstrate the effectiveness and efficiency of our approach.
Citations: 59
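
The von Mises iteration referred to in the abstract above is the standard power iteration for a dominant eigenvector. A minimal sketch of that basic building block is given here. It is not the paper's lazy incremental update, which the abstract does not detail. The function name, the starting vector, and the stopping rule are assumptions, and the input is taken to be a small symmetric matrix given as a list of rows.

```python
import math


def power_iteration(matrix, num_iters=200, tol=1e-12):
    """Approximate the dominant eigenpair of a square matrix.

    Repeatedly multiplies a vector by the matrix and renormalizes it;
    stops early when the Rayleigh-quotient estimate changes by less than tol.
    """
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n  # assumed uniform starting vector
    eigenvalue = 0.0
    for _ in range(num_iters):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break  # vector was mapped to zero; no direction to follow
        new_vec = [x / norm for x in product]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        applied = [sum(row[j] * new_vec[j] for j in range(n)) for row in matrix]
        new_eigenvalue = sum(a * b for a, b in zip(new_vec, applied))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        vec, eigenvalue = new_vec, new_eigenvalue
        if converged:
            break
    return eigenvalue, vec
```

Because each step is a single matrix-vector product, the iteration can be resumed from the previous eigenvector estimate when the matrix changes slightly, which is the property an incremental scheme would rely on.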
Modeling Temporal Adoptions Using Dynamic Matrix Factorization
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.25
Freddy Chongtat Chua, R. J. Oentaryo, Ee-Peng Lim
Abstract: The problem of recommending items to users is relevant to many applications and the problem has often been solved using methods developed from Collaborative Filtering (CF). Collaborative Filtering model-based methods such as Matrix Factorization have been shown to produce good results for static rating-type data, but have not been applied to time-stamped item adoption data. In this paper, we adopted a Dynamic Matrix Factorization (DMF) technique to derive different temporal factorization models that can predict missing adoptions at different time steps in the users' adoption history. This DMF technique is an extension of the Non-negative Matrix Factorization (NMF) based on the well-known class of models called Linear Dynamical Systems (LDS). By evaluating our proposed models against NMF and TimeSVD++ on two real datasets extracted from ACM Digital Library and DBLP, we show empirically that DMF can predict adoptions more accurately than the NMF for several prediction tasks as well as outperforming TimeSVD++ in some of the prediction tasks. We further illustrate the ability of DMF to discover evolving research interests for a few author examples.
Citations: 61
Efficiently Mining Top-K High Utility Sequential Patterns
2013 IEEE 13th International Conference on Data Mining Pub Date: 2013-12-01 DOI: 10.1109/ICDM.2013.148
Junfu Yin, Z. Zheng, Longbing Cao, Yin Song, Wei Wei
Abstract: High utility sequential pattern mining is an emerging topic in the data mining community. Compared to the classic frequent sequence mining, the utility framework provides more informative and actionable knowledge since the utility of a sequence indicates business value and impact. However, the introduction of "utility" makes the problem fundamentally different from the frequency-based pattern mining framework and brings about dramatic challenges. Although the existing high utility sequential pattern mining algorithms can discover all the patterns satisfying a given minimum utility, it is often difficult for users to set a proper minimum utility. A too small value may produce thousands of patterns, whereas a too big one may lead to no findings. In this paper, we propose a novel framework called top-k high utility sequential pattern mining to tackle this critical problem. Accordingly, an efficient algorithm, Top-k high Utility Sequence (TUS for short) mining, is designed to identify top-k high utility sequential patterns without minimum utility. In addition, three effective features are introduced to handle the efficiency problem, including two strategies for raising the threshold and one pruning for filtering unpromising items. Our experiments are conducted on both synthetic and real datasets. The results show that TUS incorporating the efficiency-enhanced strategies demonstrates impressive performance without missing any high utility sequential patterns.
Citations: 96