Sixth International Conference on Data Mining (ICDM'06): Latest Papers, Page 9

Enhancing Text Clustering Using Concept-based Mining Model

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.64

Shady Shehata, F. Karray, M. Kamel

Abstract: Most text mining techniques are based on word and/or phrase analysis of the text. Statistical analysis of term (word or phrase) frequency captures the importance of a term within a document. However, to achieve a more accurate analysis, the underlying mining technique should identify terms that capture the semantics of the text, from which the importance of a term in a sentence and in the document can be derived. A new concept-based mining model is introduced that relies on analysis of both the sentence and the document, rather than the traditional analysis of the document dataset only. The proposed mining model consists of a concept-based analysis of terms and a concept-based similarity measure. A term that contributes to the sentence semantics is analyzed with respect to its importance at the sentence and document levels. The model can efficiently find significant matching terms, either words or phrases, of the documents according to the semantics of the text. The similarity between documents relies on a new concept-based similarity measure, which is applied to the matching terms between documents. Experiments using the proposed concept-based term analysis and similarity measure in text clustering are conducted. Experimental results demonstrate that the newly developed concept-based mining model substantially enhances the clustering quality of sets of documents.

Citations: 89

Semi-Supervised Kernel Regression

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.143

Meng Wang, Xiansheng Hua, Yan Song, Lirong Dai, HongJiang Zhang

Citations: 20

Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.169

Jimeng Sun, S. Papadimitriou, Philip S. Yu

Citations: 83

Searching for Pattern Rules

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.139

Guichong Li, Howard J. Hamilton

Citations: 0

Comparisons of K-Anonymization and Randomization Schemes under Linking Attacks

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.40

Zhouxuan Teng, Wenliang Du

Citations: 18

Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.111

Kelvin Sim, Jinyan Li, V. Gopalkrishnan, Guimei Liu

Citations: 57

A Novel Scalable Algorithm for Supervised Subspace Learning

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.7

Jun Yan, Ning Liu, Benyu Zhang, Qiang Yang, Shuicheng Yan, Zheng Chen

Abstract: Subspace learning approaches aim to discover important statistical distributions in lower dimensions for high-dimensional data. Methods such as principal component analysis (PCA) do not make use of class information, and linear discriminant analysis (LDA) cannot be performed efficiently in a scalable way. In this paper, we propose a novel, highly scalable supervised subspace learning algorithm called supervised Kampong measure (SKM). It places data points as close as possible to their corresponding class mean while simultaneously placing them as far as possible from the other class means in the transformed lower-dimensional subspace. Theoretical derivation shows that our algorithm is not limited by the number of classes or by the singularity problem faced by LDA. Furthermore, our algorithm can be executed incrementally, with learning done in an online fashion as data streams are received. Experimental results on several datasets, including the very large text dataset RCV1, show the outstanding performance of our proposed algorithm on classification problems compared to PCA, LDA, and a popular feature selection approach, information gain (IG).

Citations: 11

Corrective Classification: Classifier Ensembling with Corrective and Diverse Base Learners

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.45

Yan Zhang, Xingquan Zhu, Xindong Wu

Citations: 4

Deploying Approaches for Pattern Refinement in Text Mining

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.50

Sheng-Tang Wu, Yuefeng Li, Yue Xu

Citations: 150

Constructing Ensembles for Better Ranking

Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.42

Jin Huang, C. Ling

Citations: 0