2009 IEEE International Conference on Data Mining Workshops: Latest Publications

Multi-sphere Support Vector Data Description for Outliers Detection on Multi-distribution Data
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.87
Yanshan Xiao, Bo Liu, Longbing Cao, Xindong Wu, Chengqi Zhang, Z. Hao, Fengzhao Yang, Jie Cao
Abstract: SVDD has proved to be a powerful tool for outlier detection. However, on multi-distribution data, that is, data containing several distinct distributions, it is very challenging for SVDD to generate a single hyper-sphere that distinguishes outliers from normal data. Even if such a hyper-sphere can be identified, its performance is usually not good enough. This paper proposes a multi-sphere SVDD approach, named MS-SVDD, for outlier detection on multi-distribution data. First, an adaptive sphere detection method is proposed to detect the data distributions in the dataset. The data is then partitioned according to the identified distributions, and a corresponding SVDD classifier is constructed for each partition separately. Substantial experiments on both artificial and real-world datasets demonstrate that the proposed approach outperforms the original SVDD.
Citations: 29
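For context, the single-sphere SVDD that the abstract above builds on is usually stated as the optimization problem below. This is the standard formulation given as background, not the MS-SVDD objective from the paper, which this listing does not reproduce. The symbols $a$ (sphere center), $R$ (radius), $\xi_i$ (slack variables), $C$ (trade-off constant), and $\phi$ (feature map) are notation introduced for this sketch.

```latex
\min_{R,\,a,\,\xi}\; R^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
\|\phi(x_i) - a\|^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n.
```

A test point $x$ is flagged as an outlier when $\|\phi(x) - a\|^2 > R^2$. A multi-sphere variant would presumably fit one such sphere per detected distribution, but that reading is an assumption based only on the abstract.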
GLSVM: Integrating Structured Feature Selection and Large Margin Classification
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.39
Hongliang Fei, Brian Quanz, Jun Huan
Abstract: High-dimensional data challenges current feature selection methods. For many real-world problems we have prior knowledge about the relationships among features. For example, in microarray data analysis, genes from the same biological pathway are expected to have a similar relationship to the outcome we aim to predict. Recent regularization methods for the Support Vector Machine (SVM) have achieved great success in performing feature selection and model selection simultaneously on high-dimensional data, but they neglect such relationships among features. To build interpretable SVM models, the structure information of the features should be incorporated. In this paper, we propose an algorithm, GLSVM, that automatically performs model selection and feature selection in SVMs. To incorporate prior knowledge of feature relationships, we extend the standard 2-norm SVM and use a penalty function that combines an L2-norm regularization term involving the normalized Laplacian of the feature graph with an L1 penalty. We demonstrate the effectiveness of our methods and compare them to the state of the art using two real-world benchmarks.
Citations: 4
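One plausible reading of the penalty described in the abstract above combines a hinge loss with an L1 term and a graph-Laplacian quadratic term. The exact objective is not given in this listing, so the form below is an assumption; $\lambda_1$, $\lambda_2$, the feature-graph adjacency matrix $A$, and its degree matrix $D$ (with entries $d_j$) are notation introduced here.

```latex
\min_{w,\,b}\; \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i (w^\top x_i + b)\bigr)
  + \lambda_1 \|w\|_1 + \lambda_2\, w^\top L\, w,
\qquad L = I - D^{-1/2} A D^{-1/2},

w^\top L\, w = \frac{1}{2} \sum_{j,k} A_{jk}
  \left( \frac{w_j}{\sqrt{d_j}} - \frac{w_k}{\sqrt{d_k}} \right)^{2}.
```

The second identity shows why such a term encourages features that are linked in the graph to receive similar (degree-scaled) weights, while the L1 term drives many weights to exactly zero.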
A Practical Differentially Private Random Decision Tree Classifier
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.93
G. Jagannathan, Krishnan Pillaipakkamnatt, R. Wright
Abstract: In this paper, we study the problem of constructing private classifiers using decision trees, within the framework of differential privacy. We first construct privacy-preserving ID3 decision trees using differentially private sum queries. Our experiments show that, for many data sets, a reasonable privacy guarantee can be obtained with this method only at a steep cost in prediction accuracy. We then present a differentially private decision tree ensemble algorithm using the random decision tree approach. We demonstrate experimentally that our approach yields good prediction accuracy even when the datasets are small. We also present a differentially private algorithm for the situation in which new data is periodically appended to an existing database. Our experiments show that our differentially private random decision tree classifier handles data updates in a way that maintains the same level of privacy guarantee.
Citations: 208
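The "differentially private sum queries" mentioned in the abstract above are commonly realized with the Laplace mechanism. The sketch below illustrates that general technique under stated assumptions: the function names, the clipping bounds, and the add-or-remove-one-record notion of neighboring datasets are all choices made for this example, not details taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_sum(values, epsilon, lower, upper, rng):
    """Return a noisy sum of `values` (hypothetical helper for this sketch).

    Each value is clipped to [lower, upper], so adding or removing one
    record changes the true sum by at most max(|lower|, |upper|). Laplace
    noise with scale sensitivity / epsilon is then added to the sum.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = max(abs(lower), abs(upper))
    if sensitivity == 0:
        return 0.0  # every clipped value is zero, so no noise is needed
    return sum(clipped) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # A count query is a sum over 0/1 indicators with sensitivity 1.
    indicators = [1.0] * 50 + [0.0] * 50
    print(private_sum(indicators, epsilon=1.0, lower=0.0, upper=1.0, rng=rng))
```

Smaller values of `epsilon` give stronger privacy but noisier answers, which is consistent with the accuracy cost the abstract reports for the ID3-based approach.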
Uncertainty Quantification in the Presence of Limited Climate Model Data with Discontinuities
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.111
K. Sargsyan, C. Safta, B. Debusschere, H. Najm
Abstract: Uncertainty quantification in climate models is challenged by the sparsity of the available climate data, which results from the high computational cost of the model runs. Another feature that prevents classical uncertainty analyses from being easily applicable is the bifurcative behavior of the climate data with respect to certain parameters. A typical example is the Meridional Overturning Circulation in the Atlantic Ocean. The maximum overturning stream function exhibits a discontinuity across a curve in the space of two uncertain parameters, namely climate sensitivity and CO2 forcing. We develop a methodology that performs uncertainty quantification in this context in the presence of limited data.
Citations: 5
Mining Personal Image Collection for Social Group Suggestion
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.77
Jie Yu, Xin Jin, Jiawei Han, Jiebo Luo
Abstract: Popular photo-sharing sites have attracted millions of people and helped construct massive social networks in cyberspace. Unlike in traditional social relationships, users actively interact within groups where common interests are shared around certain types of events or topics captured by photos and videos. Contributing images to a group greatly promotes interactions between users and expands their social networks. In this work, we aim to produce accurate predictions of suitable photo-sharing groups from a user's images by mining images both on the Web and in the user's personal collection. To this end, we designed a new approach that clusters popular groups into categories by analyzing the similarity of groups via SimRank. Both visual content and its annotations are integrated to understand the events or topics depicted in the images. Experiments on real user images demonstrate the feasibility of the proposed approach.
Citations: 10
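SimRank, which the abstract above uses to measure group similarity, scores two nodes as similar when their in-neighbors are similar. The following is a minimal sketch of the basic iterative SimRank computation. The toy graph (users pointing to the groups they belong to), the node names, and the decay value 0.8 are assumptions for illustration; how the paper builds its graph is not stated in this listing.

```python
def simrank(in_neighbors, decay=0.8, iterations=10):
    """Compute SimRank scores for every pair of nodes.

    `in_neighbors` maps each node to the list of nodes that link to it;
    every node must appear as a key, even if its list is empty.
    """
    nodes = sorted(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # Nodes without in-neighbors have similarity 0 to others.
                    new_sim[(a, b)] = 0.0
                    continue
                total = sum(sim[(i, j)] for i in ia for j in ib)
                new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


if __name__ == "__main__":
    # Hypothetical graph: user u1 belongs to groups g1 and g2, u2 to g3.
    graph = {"u1": [], "u2": [], "g1": ["u1"], "g2": ["u1"], "g3": ["u2"]}
    scores = simrank(graph)
    print(scores[("g1", "g2")], scores[("g1", "g3")])
```

In this toy graph, `g1` and `g2` share their only member and so receive the maximum off-diagonal score (the decay factor), while `g1` and `g3` share nothing and score zero. The resulting pairwise scores could then feed any standard clustering step.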
Pattern Mining over Star Schemas in the Onto4AR Framework
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.68
C. Antunes
Abstract: Storing data according to the multidimensional model, in particular following star schemas, has proven to be one of the most adequate ways to ease the exploration of data. However, this exploration has been limited to query-based approaches, relegating the discovery of hidden information to the background. The main reason for this is the inability of traditional mining techniques to deal with several data tables at the same time. In this paper, we propose a new approach to mine patterns in data stored as a star schema, based on a domain-driven framework in which the available knowledge is represented in a domain ontology. Pattern mining is performed by an Apriori-based algorithm, D2Apriori, but more efficient algorithms are being implemented and tested in order to solve performance issues related to the large amount of data stored in data warehouses.
Citations: 7
Sparse Least-Squares Methods in the Parallel Machine Learning (PML) Framework
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.106
R. Natarajan, Vikas Sindhwani, S. Tatikonda
Abstract: We describe parallel methods for solving large-scale, high-dimensional, sparse least-squares problems that arise in machine learning applications such as document classification. The basic idea is to solve a two-class response problem using a fast regression technique based on minimizing a loss function that consists of an empirical squared-error term and one or more regularization terms. We consider the use of Lanczos-based methods for solving these regularized least-squares problems, with a parallel implementation in the Parallel Machine Learning (PML) framework, and report performance results on the IBM Blue Gene/P parallel computer.
Citations: 1
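The loss described in the abstract above, a squared-error term plus regularization, can be written in its simplest form as ridge-regularized least squares. The single L2 penalty with weight $\lambda$ is an assumption made for this sketch (the abstract allows "one or more" regularization terms); $X$ is the sparse data matrix and $y$ the vector of two-class responses.

```latex
\min_{w}\; \|Xw - y\|_2^2 + \lambda \|w\|_2^2
\quad \Longleftrightarrow \quad
\bigl(X^\top X + \lambda I\bigr)\, w = X^\top y .
```

Iterative Krylov-subspace solvers of the kind the abstract refers to need only matrix-vector products with $X$ and $X^\top$, so $X^\top X$ is never formed explicitly. This is what makes them a natural fit for large sparse matrices distributed across many processors.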
Greedy Optimization for Contiguity-Constrained Hierarchical Clustering
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.75
Diansheng Guo
Abstract: The discovery and construction of inherent regions in large spatial datasets is an important task for many research domains such as climate zoning, eco-region analysis, public health mapping, and political redistricting. From the perspective of cluster analysis, it requires that each cluster be geographically contiguous. This paper presents a contiguity-constrained hierarchical clustering and optimization method that can partition a set of spatial objects into a hierarchy of contiguous regions while optimizing an objective function. The method consists of two steps: contiguity-constrained hierarchical clustering and two-way fine-tuning. These two steps are repeated to create a hierarchy of regions. Evaluations and comparisons show that the proposed method consistently outperforms existing methods by a large margin in terms of optimizing the objective function. Moreover, the method is flexible enough to accommodate different objective functions and additional constraints (such as a minimum size for each region), which are useful for various application domains.
Citations: 14
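The first step named in the abstract above, contiguity-constrained hierarchical clustering, can be sketched as ordinary agglomerative clustering in which only spatially adjacent clusters are allowed to merge. The version below is an illustrative simplification, not the paper's algorithm: it uses a Ward-style criterion (smallest increase in within-cluster sum of squared errors), omits the fine-tuning step, and the function and parameter names are hypothetical.

```python
from itertools import combinations


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def contiguous_agglomerate(values, adjacency, n_regions):
    """Greedily merge adjacent clusters until `n_regions` remain.

    `values` maps each spatial unit to a number; `adjacency` maps each
    unit to the set of units it borders.
    """
    clusters = {node: {node} for node in values}  # cluster id -> members
    owner = {node: node for node in values}       # unit -> cluster id
    while len(clusters) > n_regions:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Contiguity constraint: some unit in a must border a unit in b.
            touching = any(owner[nb] == b
                           for n in clusters[a] for nb in adjacency[n])
            if not touching:
                continue
            merged = [values[n] for n in clusters[a] | clusters[b]]
            cost = (sse(merged)
                    - sse([values[n] for n in clusters[a]])
                    - sse([values[n] for n in clusters[b]]))
            if best is None or cost < best[0]:
                best = (cost, a, b)
        if best is None:
            break  # remaining clusters are not adjacent to each other
        _, a, b = best
        for n in clusters[b]:
            owner[n] = a
        clusters[a] |= clusters.pop(b)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Six units on a line; unit 5 resembles units 0-2 but only borders 4.
    vals = {0: 1.0, 1: 1.0, 2: 1.0, 3: 9.0, 4: 9.0, 5: 1.0}
    adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
    print(contiguous_agglomerate(vals, adj, 2))
```

In the example, unit 5 has the same value as units 0 to 2 but is separated from them by units 3 and 4, so the constraint forces it into the region containing 3 and 4. This is the kind of trade-off that a later fine-tuning step could try to improve at the region boundaries.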
Knowledge Transfer among Heterogeneous Information Networks
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.100
E. Xiang, N. Liu, Sinno Jialin Pan, Qiang Yang
Abstract: Online recommendation systems are becoming more and more popular with the development of the web. However, a critical problem for such systems is that new users and items are continually added over time. How to overcome data sparseness for such newly arriving entities has become an important issue. In this paper, we try to reduce data sparseness in the link prediction problem by using heterogeneous information networks as auxiliary information sources. We developed two models based on the Collective Matrix Factorization (CMF) framework. We also provide a detailed empirical study of how effectively different information networks can help with two real-world link prediction tasks. We report some preliminary results of our current work and also point out several potential research issues.
Citations: 1
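The Collective Matrix Factorization framework named in the abstract above factorizes several relation matrices jointly while sharing the latent factors of entities that appear in more than one relation. A minimal squared-loss sketch for two relations follows; the matrices $X$ (target links) and $Y$ (auxiliary network), the shared factor $V$, and the weights $\alpha$ and $\lambda$ are notation assumed here, and the paper's two models may differ from this form.

```latex
\min_{U,\,V,\,Z}\;
  \|X - U V^\top\|_F^2
  + \alpha\, \|Y - V Z^\top\|_F^2
  + \lambda \bigl( \|U\|_F^2 + \|V\|_F^2 + \|Z\|_F^2 \bigr).
```

Because $V$ appears in both reconstruction terms, information in the auxiliary matrix $Y$ can shape the factors of entities that have few observed links in $X$, which is the mechanism by which such models address sparseness for new users and items.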
Understanding Climate Change Patterns with Multivariate Geovisualization
2009 IEEE International Conference on Data Mining Workshops Pub Date : 2009-12-06 DOI: 10.1109/ICDMW.2009.91
Hai Jin, Diansheng Guo
Abstract: Climate change has been a challenging and urgent research problem for many related research fields. Climate change trends and patterns are complex; they may involve many factors and vary across space and time. However, most existing visualization and mapping approaches for climate data analysis are limited to one variable or one perspective at a time. For example, it is common to map the surface temperature anomaly at different locations or to plot trends of time series. Although such approaches are useful for presenting information and knowledge, they have limited capability to support the discovery and understanding of unknown complex patterns in data that span multiple dimensions. This paper introduces the application of a multivariate geovisualization approach to explore and understand complex climate change patterns across multiple perspectives, including geographic space, time, and multiple variables.
Citations: 23