Mining Subspace Correlations

R. Harpaz, R. Haralick
2007 IEEE Symposium on Computational Intelligence and Data Mining, 2007-06-04
DOI: 10.1109/CIDM.2007.368893 (https://doi.org/10.1109/CIDM.2007.368893)
Citations: 17

Abstract

In recent applications of clustering, such as gene expression microarray analysis, collaborative filtering, and Web mining, object similarity is no longer measured by physical distance, but rather by the behavior patterns that objects manifest or the magnitude of the correlations they induce. Current state-of-the-art algorithms for this type of clustering typically postulate specific cluster models that capture only specific behavior patterns or correlations, and they overlook the possibility that other information-carrying patterns or correlations may coexist in the data. We cast the problem of searching for pattern clusters, or clusters that induce large correlations in some subset of features, as the problem of searching for groups of points embedded in lines. The advantage of this approach is that it allows different patterns or correlations to be clustered simultaneously. It also allows the clustering of patterns and correlations that are overlooked by existing methods. A formal stochastic line cluster model is presented and its connection to correlation is established. Based on this model, an algorithm that uses feature selection to search for line clusters embedded in subspaces of the data is presented.
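
The link between line clusters and correlation that the abstract mentions can be illustrated with a small synthetic sketch. This is not the authors' stochastic model or their algorithm; it is an assumption-laden toy in which the helper names (`make_line_cluster`, `line_fit_residual`), the noise level, and the choice of features are all hypothetical. It places points near a line in a two-feature subspace of five-dimensional data and shows that those two features become strongly correlated while the remaining features do not.

```python
import numpy as np


def make_line_cluster(n_points, n_features, line_features, direction,
                      noise=0.05, seed=0):
    """Toy data: points lie near a line in `line_features`, noise elsewhere.

    Hypothetical helper for illustration only; not the paper's model.
    """
    rng = np.random.default_rng(seed)
    # Features outside the subspace carry no structure (uniform noise).
    data = rng.uniform(-1.0, 1.0, size=(n_points, n_features))
    # Each point has a position t along the line.
    t = rng.uniform(-1.0, 1.0, size=n_points)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    # Points on the line, perturbed by small isotropic Gaussian noise.
    line_points = np.outer(t, direction)
    line_points += rng.normal(scale=noise, size=line_points.shape)
    data[:, line_features] = line_points
    return data


def line_fit_residual(points):
    """Fraction of variance NOT explained by the best-fitting line.

    The best-fitting line (in the orthogonal least-squares sense) runs along
    the first principal direction, so the residual is one minus the share of
    the squared first singular value.
    """
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return 1.0 - singular_values[0] ** 2 / np.sum(singular_values ** 2)


data = make_line_cluster(200, 5, line_features=[1, 3], direction=[1.0, -2.0])
corr = np.corrcoef(data, rowvar=False)

print("correlation of features 1 and 3:", round(corr[1, 3], 3))
print("correlation of features 0 and 2:", round(corr[0, 2], 3))
print("line residual in subspace (1, 3):", round(line_fit_residual(data[:, [1, 3]]), 4))
print("line residual in subspace (0, 2):", round(line_fit_residual(data[:, [0, 2]]), 4))
```

In this toy setting, a small line-fit residual in a feature subset goes together with a large absolute correlation between those features, which is the intuition behind searching for points embedded in lines rather than for one fixed pattern type.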