Fourth IEEE International Conference on Data Mining (ICDM'04): Latest Publications

Subspace selection for clustering high-dimensional data
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10112
C. Baumgartner, C. Plant, K. Murthy, H. Kriegel, Peer Kröger
Abstract: In high-dimensional feature spaces traditional clustering algorithms tend to break down in terms of efficiency and quality. Nevertheless, the data sets often contain clusters which are hidden in various subspaces of the original feature space. In this paper, we present a feature selection technique called SURFING (subspaces relevant for clustering) that finds all subspaces interesting for clustering and sorts them by relevance. The sorting is based on a quality criterion for the interestingness of a subspace using the k-nearest neighbor distances of the objects. As our method is more or less parameterless, it addresses the unsupervised notion of the data mining task "clustering" in a best possible way. A broad evaluation based on synthetic and real-world data sets demonstrates that SURFING is suitable to find all relevant subspaces in high dimensional, sparse data sets and produces better results than comparative methods.
Citations: 77
Privacy-preserving outlier detection
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10081
Jaideep Vaidya, Chris Clifton
Abstract: Outlier detection can lead to the discovery of truly unexpected knowledge in many areas such as electronic commerce, credit card fraud and especially national security. We look at the problem of finding outliers in large distributed databases where privacy/security concerns restrict the sharing of data. Both homogeneous and heterogeneous distribution of data is considered. We propose techniques to detect outliers in such scenarios while giving formal guarantees on the amount of information disclosed.
Citations: 72
A biobjective model to select features with good classification quality and low cost
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10042
E. Carrizosa, B. Martín-Barragán, D. Morales
Abstract: In this paper we address a multigroup classification problem in which we want to take into account, together with the generalization ability, costs associated with the features. This cost is not limited to an economical payment, but can also refer to risk, computational effort, space requirements, etc. In order to get a good generalization ability, we use support vector machines (SVM) as the basic mechanism by considering the maximization of the margin. We formulate the problem as a biobjective mixed integer problem, for which Pareto optimal solutions can be obtained.
Citations: 0
Dynamic daily-living patterns and association analyses in tele-care systems
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10023
Beng-Seuk Lee, T. Martin, Nick P. Clarke, B. Majeed, D. Nauck
Abstract: Tele-care systems aim to carry out intelligent analyses of a person's wellbeing using data about their daily activities. This is a very challenging task because the massive dataset is likely to be erroneous, possibly with misleading sections due to noise or missing values. Furthermore, the interpretation of the data is highly sensitive to the lifestyle of the monitored person and the environment in which they interact. In our tele-care project, sensor-network domain knowledge is used to overcome the difficulties of monitoring long-term wellbeing with an imperfect data source. In addition, a fuzzy association analysis is leveraged to implement a dynamic and flexible analysis over individual- and environment-dependent data.
Citations: 13
Orthogonal decision trees
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10072
H. Kargupta, Haimonti Dutta
Abstract: This paper introduces orthogonal decision trees that offer an effective way to construct a redundancy-free, accurate, and meaningful representation of large decision-tree-ensembles often created by popular techniques such as bagging, boosting, random forests and many distributed and data stream mining algorithms. Orthogonal decision trees are functionally orthogonal to each other and they correspond to the principal components of the underlying function space. This paper offers a technique to construct such trees based on eigen-analysis of the ensemble and offers experimental results to document the performance of orthogonal trees on grounds of accuracy and model complexity.
Citations: 28
Communication efficient construction of decision trees over heterogeneously distributed data
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10114
C. Giannella, Kun Liu, Todd Olsen, H. Kargupta
Abstract: We present an algorithm designed to efficiently construct a decision tree over heterogeneously distributed data without centralizing. We compare our algorithm against a standard centralized decision tree implementation in terms of accuracy as well as the communication complexity. Our experimental results show that by using only 20% of the communication cost necessary to centralize the data we can achieve trees with accuracy at least 80% of the trees produced by the centralized version.
Citations: 43
Extensible Markov model
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10067
M. Dunham, Yu Meng, Jie Huang
Abstract: A Markov chain is a popular data modeling tool. This paper presents a variation of Markov chain, namely extensible Markov model (EMM). By providing a dynamically adjustable structure, EMM overcomes the problems caused by the static nature of the traditional Markov chain. Therefore, EMMs are particularly well suited to model spatiotemporal data such as network traffic, environmental data, weather data, and automobile traffic. Performance studies using EMMs for spatiotemporal prediction problems show the advantages of this approach.
Citations: 39
Mass spectrum labeling: theory and practice
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10080
Zheng Huang, Lei Chen, Jin-Yi Cai, D. S. Gross, D. Musicant, R. Ramakrishnan, J. Schauer, Stephen J. Wright
Abstract: We introduce the problem of labeling a particle's mass spectrum with the substances it contains, and develop several formal representations of the problem, taking into account practical complications such as unknown compounds and noise. This task is currently a bottleneck in analyzing data from a new generation of instruments for real-time environmental monitoring.
Citations: 4
Mining associations by linear inequalities
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10098
T. Lin
Abstract: The main theorem is: generalized associations of a relational table can be found by a finite set of linear inequalities within polynomial time. It is derived from the following three results, which were established in ICDMO'02 and are re-developed here. They are: (1) isomorphic theorem: isomorphic relations have isomorphic patterns. Such an isomorphism classifies relational tables into isomorphic classes. (2) A variant of the classical bitmaps indexes uniquely exists in each isomorphic class. We take it as the canonical model of the class. (3) All possible attributes/features can be generated by a generalized procedure of the classical AOG (attribute oriented generalization). Then, (4) the main theorem for canonical model is established. By isomorphism theorem, we had the final result (5).
Citations: 14
Mining Web data to create online navigation recommendations
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10019
J. D. Velásquez, Alejandro Bassi, H. Yasuda, T. Aoki
Abstract: A system to provide online navigation recommendation for Web visitors is introduced. We call visitor the anonymous user, i.e., when only data about her/his browsing behavior (Web logs) are available. We first apply clustering techniques over a large sample of Web data. Next, from the significant patterns that are discovered, a set of rules about how to use them is created. Finally, comparing the current Web visitor session with the patterns, online navigation recommendations are proposed using the mentioned rules. The system was tested using data from a real Web site, showing its effectiveness.
Citations: 19