Sixth International Conference on Data Mining (ICDM'06): Latest Papers (Page 5)

Rule-Based Platform for Web User Profiling

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.137

Jianping Zhang, Manu Shukla

Citations: 10

Improving Grouped-Entity Resolution Using Quasi-Cliques

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.85

Byung-Won On, Ergin Elmacioglu, Dongwon Lee, Jaewoo Kang, J. Pei

Citations: 60

Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.134

Li Wang, Michael D. Gordon, Ji Zhu

Abstract: Linear regression is one of the most important and widely used techniques for data analysis. However, sometimes people are not satisfied with it because of the following two limitations: 1) its results are sensitive to outliers, so when the error terms are not normally distributed, especially when they have heavy-tailed distributions, linear regression often works badly; 2) its estimated coefficients tend to have high variance, although their bias is low. To reduce the influence of outliers, robust regression models were developed. Least absolute deviation (LAD) regression is one of them. LAD minimizes the mean absolute errors, instead of mean squared errors, so its results are more robust. To address the second limitation, shrinkage methods were proposed, which add a penalty on the size of the coefficients. The LASSO is one of these methods and it uses the L1-norm penalty, which not only reduces the prediction error and the variance of estimated coefficients, but also provides an automatic feature selection function. In this paper, we propose the regularized least absolute deviation (RLAD) regression model, which combines the nice features of the LAD and the LASSO together. The RLAD is a regularization method, whose objective function has the form of "loss + penalty." The "loss" is the sum of the absolute deviations and the "penalty" is the L1-norm of the coefficient vector. Furthermore, to facilitate parameter tuning, we develop an efficient algorithm which can solve the entire regularization path in one pass. Simulations with various settings are performed to demonstrate its performance. Finally, we apply the algorithm to solve the image reconstruction problem and find interesting results.
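
The "loss + penalty" objective described in the abstract can be written out as a single formula. This is only a sketch under assumed notation that the listing does not give: n observations (x_i, y_i), a coefficient vector beta with p entries, no separate intercept term, and a tuning parameter lambda >= 0 that weights the penalty.

```latex
% Assumed notation (not from the listing): x_i are predictor vectors,
% y_i are responses, \beta \in R^p, \lambda \ge 0, no intercept term.
\hat{\beta}(\lambda) \;=\; \arg\min_{\beta \in \mathbb{R}^p}\;
  \underbrace{\sum_{i=1}^{n} \bigl| y_i - x_i^{\top}\beta \bigr|}_{\text{loss: sum of absolute deviations}}
  \;+\; \lambda\,
  \underbrace{\sum_{j=1}^{p} \lvert \beta_j \rvert}_{\text{penalty: } \lVert \beta \rVert_1}
```

In this notation, lambda = 0 gives plain LAD regression, and replacing the absolute loss with a squared loss gives the LASSO. The regularization path mentioned in the abstract is the solution viewed as a function of lambda.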

Citations: 95

Data Mining Approaches to Criminal Career Analysis

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.47

J. D. Bruin, Tim K. Cocx, W. Kosters, J. Laros, J. Kok

Abstract: Narrative reports and criminal records are stored digitally across individual police departments, enabling the collection of this data to compile a nation-wide database of criminals and the crimes they committed. The compilation of this data through the last years presents new possibilities of analyzing criminal activity through time. Augmenting the traditional, more socially oriented, approach of behavioral study of these criminals and traditional statistics, data mining methods like clustering and prediction enable police forces to get a clearer picture of criminal careers. This allows officers to recognize crucial spots in changing criminal behaviour and deploy resources to prevent these careers from unfolding. Four important factors play a role in the analysis of criminal careers: crime nature, frequency, duration and severity. We describe a tool that extracts these from the database and creates digital profiles for all offenders. It compares all individuals on these profiles by a new distance measure and clusters them accordingly. This method yields a visual clustering of these criminal careers and enables the identification of classes of criminals. The proposed method allows for several user-defined parameters.

Citations: 110

Automatic Single-Organ Segmentation in Computed Tomography Images

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.24

Ruchaneewan Susomboon, D. Raicu, J. Furst, D. Channin

Citations: 20

The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.160

Tao Li, C. Ding

Citations: 308

Adaptive Parallel Graph Mining for CMP Architectures

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.15

G. Buehrer, S. Parthasarathy, Yen-kuang Chen

Abstract: Mining graph data is an increasingly popular challenge, which has practical applications in many areas, including molecular substructure discovery, Web link analysis, fraud detection, and social network analysis. The problem statement is to enumerate all subgraphs occurring in at least sigma graphs of a database, where sigma is a user specified parameter. Chip multiprocessors (CMPs) provide true parallel processing, and are expected to become the de facto standard for commodity computing. In this work, building on the state-of-the-art, we propose an efficient approach to parallelize such algorithms for CMPs. We show that an algorithm which adapts its behavior based on the runtime state of the system can improve system utilization and lower execution times. Most notably, we incorporate dynamic state management to allow memory consumption to vary based on availability. We evaluate our techniques on current day shared memory systems (SMPs) and expect similar performance for CMPs. We demonstrate excellent speedup, 27-fold on 32 processors for several real world datasets. Additionally, we show our dynamic techniques afford this scalability while consuming up to 35% less memory than static techniques.

Citations: 59

Detecting Link Spam Using Temporal Information

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.51

Guoyang Shen, Bin Gao, Tie-Yan Liu, Guang Feng, Shiji Song, Hang Li

Citations: 62

Boosting for Learning Multiple Classes with Imbalanced Class Distribution

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.29

Yanmin Sun, M. Kamel, Yang Wang

Citations: 292

Personalization in Context: Does Context Matter When Building Personalized Customer Models?

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.125

M. Gorgoglione, C. Palmisano, A. Tuzhilin

Citations: 35