Sixth International Conference on Data Mining (ICDM'06): Latest Papers, Page 2

Boosting the Feature Space: Text Classification for Unstructured Data on the Web

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.31

Yang Song, Ding Zhou, Jian Huang, Isaac G. Councill, H. Zha, C. Lee Giles

Citations: 17

Stability Region Based Expectation Maximization for Model-based Clustering

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.152

C. Reddy, H. Chiang, B. Rajaratnam

Abstract: In spite of the initialization problem, the expectation-maximization (EM) algorithm is widely used for estimating the parameters in several data mining related tasks. Most popular model-based clustering techniques might yield poor clusters if the parameters are not initialized properly. To reduce the sensitivity to initial points, a novel algorithm for learning mixture models from multivariate data is introduced in this paper. The proposed algorithm takes advantage of TRUST-TECH (TRansformation Under STability-reTaining Equilibria CHaracterization) to compute neighborhood local maxima on the likelihood surface using stability regions. Basically, our method coalesces the advantages of the traditional EM with that of the dynamic and geometric characteristics of the stability regions of the corresponding nonlinear dynamical system of the log-likelihood function. Two phases, namely the EM phase and the stability region phase, are repeated alternately in the parameter space to achieve improvements in the maximum likelihood. Though applied to Gaussian mixtures in this paper, our technique can be easily generalized to any other parametric finite mixture model. The algorithm has been tested on both synthetic and real datasets and the improvements in the performance compared to other approaches are demonstrated. The robustness with respect to initialization is also illustrated experimentally.
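The abstract above alternates an EM phase with a stability region phase. As a rough illustration of the EM phase only, the following is a minimal sketch of standard EM for a one-dimensional Gaussian mixture. It does not implement the TRUST-TECH stability region phase, and all function names, the variance floor `min_var`, and the sample data are assumptions made for illustration rather than anything taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances, min_var=1e-6):
    """One EM iteration; min_var is an assumed floor that avoids zero variance."""
    k = len(weights)
    # E-step: responsibility of each component for each point.
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate weights, means, and variances.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, min_var))
    return new_w, new_m, new_v


def run_em(data, weights, means, variances, iterations=50):
    """Run a fixed number of EM iterations from the given starting point."""
    for _ in range(iterations):
        weights, means, variances = em_step(data, weights, means, variances)
    return weights, means, variances


if __name__ == "__main__":
    sample = [-2.1, -1.9, -2.0, -1.8, -2.2, 2.0, 2.1, 1.9, 2.2, 1.8]
    w, m, v = run_em(sample, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
    print(w, m, v, log_likelihood(sample, w, m, v))
```

Running `run_em` from several different starting points and comparing the resulting log-likelihoods is one simple way to observe the sensitivity to initialization that the abstract describes.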

Citations: 4

Fast Relevance Discovery in Time Series

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.71

Chang-Shing Perng, Haixun Wang, Sheng Ma

Citations: 4

Resource Management for Networked Classifiers in Distributed Stream Mining Systems

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.136

D. Turaga, O. Verscheure, U. Chaudhari, Lisa Amini

Citations: 23

Anytime Classification Using the Nearest Neighbor Algorithm with Applications to Stream Mining

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.21

Ken Ueno, X. Xi, Eamonn J. Keogh, Dah-Jye Lee

Citations: 115

TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.161

Hui Xiong, Mark Brodie, Sheng Ma

Citations: 33

Using an Ensemble of One-Class SVM Classifiers to Harden Payload-based Anomaly Detection Systems

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.165

R. Perdisci, G. Gu, Wenke Lee

Citations: 243

GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.79

Huahai He, Ambuj K. Singh

Abstract: We propose a technique for evaluating the statistical significance of frequent subgraphs in a database. A graph is represented by a feature vector that is a histogram over a set of basis elements. The set of basis elements is chosen based on domain knowledge and consists generally of vertices, edges, or small graphs. A given subgraph is transformed to a feature vector and the significance of the subgraph is computed by considering the significance of occurrence of the corresponding vector. The probability of occurrence of the vector in a random vector is computed based on the prior probability of the basis elements. This is then used to obtain a probability distribution on the support of the vector in a database of random vectors. The statistical significance of the vector/subgraph is then defined as the p-value of its observed support. We develop efficient methods for computing p-values and lower bounds. A simplified model is further proposed to improve the efficiency. We also address the problem of feature vector mining, a generalization of itemset mining where counts are associated with items and the goal is to find significant sub-vectors. We present an algorithm that explores closed frequent sub-vectors to find significant ones. Experimental results show that the proposed techniques are effective, efficient, and useful for ranking frequent subgraphs by their statistical significance.
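The abstract above defines significance as the p-value of a vector's observed support in a database of random vectors. One way to picture this, assuming the random vectors are independent so that the support follows a binomial distribution, is the upper-tail sum sketched below. The function name and parameters are hypothetical, the per-vector occurrence probability is taken as a given input rather than derived from basis-element priors, and this is not presented as the paper's actual computation.

```python
from math import comb


def support_p_value(n_vectors: int, p_occurrence: float, observed_support: int) -> float:
    """P(support >= observed_support) for support ~ Binomial(n_vectors, p_occurrence).

    Assumes each of the n_vectors random vectors independently contains the
    query vector with probability p_occurrence (an illustrative assumption).
    """
    if not 0.0 <= p_occurrence <= 1.0:
        raise ValueError("p_occurrence must lie in [0, 1]")
    if observed_support <= 0:
        return 1.0  # every outcome has support >= 0
    if observed_support > n_vectors:
        return 0.0  # support cannot exceed the database size
    q = 1.0 - p_occurrence
    # Sum the binomial probability mass over the upper tail.
    return sum(
        comb(n_vectors, k) * p_occurrence ** k * q ** (n_vectors - k)
        for k in range(observed_support, n_vectors + 1)
    )


if __name__ == "__main__":
    # A smaller p-value means the observed support is less likely by chance.
    print(support_p_value(n_vectors=100, p_occurrence=0.05, observed_support=12))
```

Under this reading, ranking candidates by ascending p-value puts the most surprising ones first, which matches the ranking use described in the abstract.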

Citations: 58

Temporal Data Mining in Dynamic Feature Spaces

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.157

B. Wenerstrom, C. Giraud-Carrier

Citations: 43

Speedup Clustering with Hierarchical Ranking

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.151

Jianjun Zhou, J. Sander

Citations: 4