Sixth International Conference on Data Mining (ICDM'06) — Latest Publications

Enhancing Text Clustering Using Concept-based Mining Model
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.64
Shady Shehata, F. Karray, M. Kamel
Abstract: Most text mining techniques are based on word and/or phrase analysis of the text. The statistical analysis of a term's (word or phrase) frequency captures the importance of the term within a document. However, to achieve a more accurate analysis, the underlying mining technique should identify terms that capture the semantics of the text, from which the importance of a term in a sentence and in the document can be derived. A new concept-based mining model is introduced that relies on the analysis of both the sentence and the document, rather than the traditional analysis of the document dataset only. The proposed model consists of a concept-based analysis of terms and a concept-based similarity measure. A term that contributes to the sentence semantics is analyzed with respect to its importance at the sentence and document levels. The model can efficiently find significant matching terms, either words or phrases, of the documents according to the semantics of the text. The similarity between documents relies on a new concept-based similarity measure applied to the matching terms between documents. Experiments using the proposed concept-based term analysis and similarity measure in text clustering are conducted. Experimental results demonstrate that the newly developed concept-based mining model substantially enhances the clustering quality of sets of documents.
Citations: 89
Semi-Supervised Kernel Regression
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.143
Meng Wang, Xiansheng Hua, Yan Song, Lirong Dai, HongJiang Zhang
Abstract: Insufficient training data is a major obstacle in machine learning and data mining applications. Many semi-supervised learning algorithms have been proposed to tackle this difficulty by leveraging a large amount of unlabeled data, but most of them focus on semi-supervised classification. In this paper we propose a semi-supervised regression algorithm named semi-supervised kernel regression (SSKR). While classical kernel regression is based only on labeled examples, our approach extends it to all observed examples, using a weighting factor to modulate the effect of unlabeled examples. Experimental results show that SSKR significantly outperforms traditional kernel regression and graph-based semi-supervised regression methods.
Citations: 20
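The SSKR abstract describes extending classical kernel regression from labeled examples to all observed examples, with a weighting factor modulating the unlabeled ones. A minimal illustrative sketch of that idea, using a Nadaraya-Watson-style estimator — the function names, the provisional targets `y_u_est`, and the mixing form `alpha` are assumptions, not the paper's exact formulation:

```python
import numpy as np

def gaussian_kernel(x, xi, h):
    # Gaussian kernel weight between query point x and example xi, bandwidth h
    return np.exp(-((x - xi) ** 2) / (2 * h ** 2))

def sskr_predict(x, X_l, y_l, X_u, y_u_est, alpha=0.5, h=1.0):
    """Kernel-regression estimate that mixes labeled examples (X_l, y_l)
    with unlabeled examples X_u carrying provisional targets y_u_est.
    alpha plays the role of the abstract's weighting factor: alpha=0
    recovers classical (labeled-only) kernel regression."""
    w_l = np.array([gaussian_kernel(x, xi, h) for xi in X_l])
    w_u = alpha * np.array([gaussian_kernel(x, xi, h) for xi in X_u])
    num = np.dot(w_l, y_l) + np.dot(w_u, y_u_est)
    den = w_l.sum() + w_u.sum()
    return num / den
```

With `alpha=0` the unlabeled terms vanish and the estimate reduces to the standard Nadaraya-Watson form, which matches the abstract's framing of classical kernel regression as the special case using labeled examples only.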
Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.169
Jimeng Sun, S. Papadimitriou, Philip S. Yu
Abstract: Data stream values are often associated with multiple aspects. For example, each value from environmental sensors may have an associated type (e.g., temperature or humidity) as well as a location; aside from the timestamp, type and location are two additional aspects. How can such streams be modeled? How can patterns be found simultaneously within and across the multiple aspects? How can this be done incrementally, in a streaming fashion? In this paper, all these problems are addressed through a general data model, tensor streams, and an effective algorithmic framework, window-based tensor analysis (WTA). Two variations of WTA, independent-window tensor analysis (IW) and moving-window tensor analysis (MW), are presented and evaluated extensively on real datasets. Finally, we illustrate one important application that uses WTA, multi-aspect correlation analysis (MACA), and demonstrate its effectiveness on an environmental monitoring application.
Citations: 83
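The abstract distinguishes independent-window (IW) and moving-window (MW) variants of WTA. A sketch of the window construction only — how each window tensor stacks time slices of a multi-aspect stream — under the assumption that IW uses non-overlapping windows and MW uses overlapping ones; the per-window decomposition the paper applies is not shown here:

```python
import numpy as np

def moving_windows(stream, W):
    """Moving-window construction: each window tensor stacks the W most
    recent time slices, and consecutive windows overlap by W-1 steps.
    `stream` is a list of equal-shaped arrays, one per timestamp, whose
    remaining axes are the non-time aspects (e.g., type x location)."""
    return [np.stack(stream[t:t + W]) for t in range(len(stream) - W + 1)]

def independent_windows(stream, W):
    """Independent-window construction: non-overlapping windows of size W."""
    return [np.stack(stream[t:t + W]) for t in range(0, len(stream) - W + 1, W)]

# A toy tensor stream: 6 timestamps, each a 2x3 slice (2 types x 3 locations)
stream = [np.full((2, 3), t, dtype=float) for t in range(6)]
mw = moving_windows(stream, 3)       # 4 overlapping window tensors, shape (3, 2, 3)
iw = independent_windows(stream, 3)  # 2 disjoint window tensors, shape (3, 2, 3)
```

Each window tensor has a time mode plus one mode per aspect, which is what lets a per-window factorization find patterns both within and across aspects.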
Searching for Pattern Rules
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.139
Guichong Li, Howard J. Hamilton
Abstract: We address the problem of finding a set of pattern rules from a transaction dataset, given a statistical metric. A new data structure, called an incrementally counting suffix tree (ICST), is proposed for online computation of estimates of the support of any pattern or itemset. Using an ICST, our approach directly generates a set of pattern rules with a single scan of the whole dataset in partitions, without generating frequent itemsets. Non-redundant rules can be found by removing redundancies from the pattern rules. The PPMCR algorithm first finds pattern rules and then non-redundant rules by generating valid candidates while traversing the ICST. Experimental results show that the PPMCR algorithm can efficiently mine a smaller set of non-redundant rules.
Citations: 0
Comparisons of K-Anonymization and Randomization Schemes under Linking Attacks
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.40
Zhouxuan Teng, Wenliang Du
Abstract: K-anonymity has recently gained popularity as a privacy quantification against linking attacks, in which attackers try to identify a record from the values of some identifying attributes. If the attack succeeds, the identity of the record is revealed and potentially confidential information contained in its other attributes is disclosed. K-anonymity counters this attack by requiring that each record be indistinguishable from at least K-1 other records with respect to the identifying attributes. Randomization can also be used to protect against linking attacks. In this paper, we compare the performance of K-anonymization and randomization schemes under linking attacks. We present a new privacy definition that can be applied to both K-anonymization and randomization, compare the two schemes in terms of both utility and risk of privacy disclosure, and promote the use of the R-U confidentiality map for such comparisons. We also compare various randomization schemes.
Citations: 18
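The indistinguishability condition the abstract states — each record must match at least K-1 others on the identifying attributes — can be checked mechanically. A small sketch (the record layout as dicts and the function name are illustrative assumptions, not from the paper):

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True iff every record shares its quasi-identifier values
    with at least k-1 other records, i.e., every equivalence class over
    the identifying attributes has size >= k."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Toy released table: zip and (generalized) age are the identifying attributes
data = [
    {"zip": "13210", "age": "2*", "disease": "flu"},
    {"zip": "13210", "age": "2*", "disease": "cold"},
    {"zip": "14850", "age": "3*", "disease": "flu"},
]
# The third record is unique on (zip, age), so the table is not 2-anonymous.
```

A linking attack succeeds exactly when some equivalence class has size below K, letting the attacker pin a record to one individual.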
Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.111
Kelvin Sim, Jinyan Li, V. Gopalkrishnan, Guimei Liu
Abstract: We introduce an unsupervised process to co-cluster groups of stocks and financial ratios, so that investors can gain more insight into how they are correlated. Our co-clustering idea is based on a graph concept called maximal quasi-bicliques, which can tolerate erroneous and/or missing information that is common in stock and financial ratio data. Compared to previous work, our maximal quasi-bicliques require errors to be evenly distributed, which enables us to capture more meaningful co-clusters. We develop a new algorithm that can efficiently enumerate maximal quasi-bicliques from an undirected graph. The concept of maximal quasi-bicliques is domain-independent; it can be extended to perform co-clustering on any set of data that is modeled by graphs.
Citations: 57
A Novel Scalable Algorithm for Supervised Subspace Learning
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.7
Jun Yan, Ning Liu, Benyu Zhang, Qiang Yang, Shuicheng Yan, Zheng Chen
Abstract: Subspace learning approaches aim to discover the important statistical distribution of high-dimensional data on lower dimensions. Methods such as principal component analysis (PCA) do not make use of class information, and linear discriminant analysis (LDA) cannot be performed efficiently in a scalable way. In this paper, we propose a novel, highly scalable supervised subspace learning algorithm called supervised Kampong measure (SKM). It assigns data points as close as possible to their corresponding class mean while simultaneously keeping them as far as possible from the other class means in the transformed lower-dimensional subspace. Theoretical derivation shows that our algorithm is limited neither by the number of classes nor by the singularity problem faced by LDA. Furthermore, our algorithm can be executed incrementally, with learning done in an online fashion as data streams are received. Experimental results on several datasets, including the very large text dataset RCV1, show the outstanding performance of the proposed algorithm on classification problems compared to PCA, LDA, and a popular feature selection approach, information gain (IG).
Citations: 11
Corrective Classification: Classifier Ensembling with Corrective and Diverse Base Learners
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.45
Yan Zhang, Xingquan Zhu, Xindong Wu
Abstract: Empirical studies on supervised learning have shown that ensembling methods often lead to a model superior to one built from a single learner, especially when learning from imperfect information sources, such as biased or noise-infected data. In this paper, we provide a novel corrective classification (C2) design, which incorporates error detection, data cleansing, and Bootstrap sampling to construct the base learners that constitute the classifier ensemble. The essential goal is to reduce noise impacts and eventually enhance the learners built from noise-corrupted data. We further analyze the importance of both the accuracy and the diversity of base learners in ensembling, in order to shed some light on the mechanism by which C2 works. Experimental comparisons demonstrate that C2 is not only superior to the learner built from the original noisy sources, but also more reliable than bagging or the aggressive classifier ensemble (ACE), two degenerate components/variants of C2.
Citations: 4
Deploying Approaches for Pattern Refinement in Text Mining
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.50
Sheng-Tang Wu, Yuefeng Li, Yue Xu
Abstract: Text mining helps users find useful information in large collections of digital text documents on the Web or in databases. Instead of the keyword-based approach typically used in this field, a pattern-based model containing frequent sequential patterns is employed to perform the same tasks. However, how to effectively use these discovered patterns remains a big challenge. In this study, we propose two approaches based on pattern deploying strategies. The performance of the pattern deploying algorithms for text mining is investigated on the Reuters dataset RCV1, and the results show that effectiveness is improved by our proposed pattern refinement approaches.
Citations: 150
Constructing Ensembles for Better Ranking
Sixth International Conference on Data Mining (ICDM'06) Pub Date: 2006-12-18 DOI: 10.1109/ICDM.2006.42
Jin Huang, C. Ling
Abstract: We propose a novel algorithm, RankDE, to build an ensemble using an extra artificial dataset. RankDE aims at improving overall ranking performance, which is crucial in many machine learning applications. The algorithm constructs artificial datasets that are diverse from the current training dataset in terms of ranking. We conduct experiments on real-world datasets to compare RankDE, in terms of ranking, with traditional and state-of-the-art ensembling algorithms: Bagging, Adaboost, DECORATE, and Rankboost. The experiments show that RankDE outperforms Bagging, DECORATE, Adaboost, and Rankboost when limited data is available; when enough training data is available, it is competitive with DECORATE and Adaboost.
Citations: 0