Fourth IEEE International Conference on Data Mining (ICDM'04): Latest Papers

Discovery of functional relationships in multi-relational data using inductive logic programming
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10053
Alexessander Alves, Rui Camacho, Eugénio C. Oliveira
Abstract: Inductive logic programming (ILP) systems have been widely applied to data mining classification tasks with considerable success. The use of ILP systems in regression tasks has been far less successful. Current systems have very limited numerical reasoning capabilities, which limits the application of ILP to the discovery of functional relationships of a numeric nature. This paper proposes improvements in the numerical reasoning capabilities of ILP systems for dealing with regression tasks. It proposes the use of statistics-based techniques such as model validation and model selection to improve noise handling, and it introduces a search stopping criterion based on the PAC method to evaluate learning performance. We have found these extensions essential to improving on the results of the machine learning and statistics-based algorithms used in the empirical evaluation study.
Citations: 3
Active feature-value acquisition for classifier induction
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10075
Prem Melville, M. Saar-Tsechansky, F. Provost, R. Mooney
Abstract: Many induction problems include missing data that can be acquired at a cost. For building accurate predictive models, acquiring complete information for all instances is often expensive or unnecessary, while acquiring information for a random subset of instances may not be the most effective strategy. Active feature-value acquisition tries to reduce the cost of achieving a desired model accuracy by identifying instances for which obtaining complete information is most informative. We present an approach in which instances are selected for acquisition based on the current model's accuracy and its confidence in the prediction. Experimental results demonstrate that our approach can induce accurate models using substantially fewer feature-value acquisitions than alternative policies.
Citations: 105
Integrating multi-objective genetic algorithms into clustering for fuzzy association rules mining
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10050
Mehmet Kaya, R. Alhajj
Abstract: In this paper, we propose an automated method for deciding on the number of fuzzy sets and for the autonomous mining of both fuzzy sets and fuzzy association rules. We compare the proposed multi-objective GA-based approach with (1) a CURE-based approach and (2) the clustering approach of Chien et al. (2001). Experimental results on 100K transactions extracted from the adult data of the United States census in the year 2000 show that the proposed method performs better than the other two approaches in terms of runtime, number of large itemsets, and number of association rules.
Citations: 36
On closed constrained frequent pattern mining
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10093
F. Bonchi, C. Lucchese
Abstract: Constrained frequent patterns and closed frequent patterns are two paradigms aimed at reducing the set of extracted patterns to a smaller, more interesting subset. Although a lot of work has been done with both these paradigms, there is still confusion around the mining problem obtained by joining closed and constrained frequent patterns in a single framework. In this paper, we shed light on this problem by providing a formal definition and a thorough characterization. We also study computational issues and show how to combine the most recent results in both paradigms, providing a very efficient algorithm which exploits the two requirements (satisfying constraints and being closed) together at mining time in order to reduce the computation as much as possible.
Citations: 107
Multi-view clustering
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10095
S. Bickel, T. Scheffer
Abstract: We consider clustering problems in which the available attributes can be split into two independent subsets, such that either subset suffices for learning. Example applications of this multi-view setting include clustering of Web pages which have an intrinsic view (the pages themselves) and an extrinsic view (e.g., anchor texts of inbound hyperlinks); multi-view learning has so far been studied in the context of classification. We develop and study partitioning and agglomerative, hierarchical multi-view clustering algorithms for text data. We find empirically that the multi-view versions of k-means and EM greatly improve on their single-view counterparts. By contrast, we obtain negative results for agglomerative hierarchical multi-view clustering. Our analysis explains this surprising phenomenon.
Citations: 763
Filling-in missing objects in orders
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10047
Toshihiro Kamishima, S. Akaho
Abstract: Filling-in techniques are important, since missing values frequently appear in real data. Such techniques have been established for categorical or numerical values. Though lists of ordered objects are widely used as representational forms (e.g., Web search results, best-seller lists), filling-in techniques for orders have received little attention. We therefore propose a simple but effective technique to fill in missing objects in orders. We built this technique into our collaborative filtering system.
Citations: 7
Decision tree evolution using limited number of labeled data items from drifting data streams
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10026
W. Fan, Yi-an Huang, Philip S. Yu
Abstract: Most previously proposed mining methods on data streams make the unrealistic assumption that a "labelled" data stream is readily available and can be mined at any time. However, in most real-world problems, labelled data streams are rarely immediately available. For this reason, models are reconstructed only when labelled data become available periodically. This passive stream mining model has several drawbacks. We propose a concept of demand-driven active data mining. In active mining, the loss of the model is either continuously guessed without using any true class labels or estimated, whenever necessary, from a small number of instances whose actual class labels are verified by paying an affordable cost. When the estimated loss exceeds a tolerable threshold, the model evolves by using a small number of instances with verified true class labels. Previous work on active mining concentrates on error guessing and estimation. In this paper, we discuss several approaches to decision tree evolution.
Citations: 32
Density connected clustering with local subspace preferences
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10087
C. Böhm, K. Murthy, H. Kriegel, Peer Kröger
Abstract: Many clustering algorithms tend to break down in high-dimensional feature spaces, because the clusters often exist only in specific subspaces (attribute subsets) of the original feature space. Therefore, the task of projected clustering (or subspace clustering) has been defined recently. To tackle this problem, we propose the concept of local subspace preferences, which captures the main directions of high point density. Using this concept, we adapt density-based clustering to cope with high-dimensional data. In particular, we achieve the following advantages over existing approaches: our proposed method has a determinate result, does not depend on the order of processing, is robust against noise, performs only a single scan over the database, and is linear in the number of dimensions. A broad experimental evaluation shows that our approach yields results of significantly better quality than recent work on clustering high-dimensional data.
Citations: 188
Learning weighted naive Bayes with accurate ranking
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10030
Harry Zhang, Shengli Sheng
Abstract: Naive Bayes is one of the most effective classification algorithms. In many applications, however, a ranking of examples is more desirable than just classification. How to extend naive Bayes to improve its ranking performance is an interesting and useful question in practice. Weighted naive Bayes is an extension of naive Bayes in which attributes have different weights. This paper investigates how to learn a weighted naive Bayes with accurate ranking from data, or more precisely, how to learn the weights of a weighted naive Bayes to produce accurate ranking. We explore various methods: the gain ratio method, the hill climbing method, the Markov chain Monte Carlo method, the hill climbing method combined with the gain ratio method, and the Markov chain Monte Carlo method combined with the gain ratio method. Our experiments show that a weighted naive Bayes trained to produce accurate ranking outperforms naive Bayes.
Citations: 163
Divide and prosper: comparing models of customer behavior from populations to individuals
Fourth IEEE International Conference on Data Mining (ICDM'04) Pub Date : 2004-11-01 DOI: 10.1109/ICDM.2004.10013
Tianyi Jiang, A. Tuzhilin
Abstract: This paper compares customer segmentation, 1-to-1, and aggregate marketing approaches across a broad range of experimental settings, including multiple segmentation levels, marketing datasets, dependent variables, and different types of classifiers, segmentation techniques, and predictive measures. Our experimental results show that, overall, 1-to-1 modeling significantly outperforms the aggregate approach among high-volume customers and is never worse than the aggregate approach among low-volume customers. Moreover, the best segmentation techniques tend to outperform 1-to-1 modeling among low-volume customers.
Citations: 1