Sixth International Conference on Data Mining (ICDM'06): Latest Publications

Manifold Clustering of Shapes
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.101
Dragomir Yankov, Eamonn J. Keogh
{"title":"Manifold Clustering of Shapes","authors":"Dragomir Yankov, Eamonn J. Keogh","doi":"10.1109/ICDM.2006.101","DOIUrl":"https://doi.org/10.1109/ICDM.2006.101","url":null,"abstract":"Shape clustering can significantly facilitate the automatic labeling of objects present in image collections. For example, it could outline the existing groups of pathological cells in a bank of cyto-images; the groups of species on photographs collected from certain aerials; or the groups of objects observed on surveillance scenes from an office building. Here we demonstrate that a nonlinear projection algorithm such as Isomap can attract together shapes of similar objects, suggesting the existence of isometry between the shape space and a low dimensional nonlinear embedding. Whenever there is a relatively small amount of noise in the data, the projection forms compact, convex clusters that can easily be learned by a subsequent partitioning scheme. We further propose a modification of the Isomap projection based on the concept of degree-bounded minimum spanning trees. The proposed approach is demonstrated to move apart bridged clusters and to alleviate the effect of noise in the data.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116517044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 44
Identifying Follow-Correlation Itemset-Pairs
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.84
Shichao Zhang, Jilian Zhang, Xiaofeng Zhu, Zifang Huang
{"title":"Identifying Follow-Correlation Itemset-Pairs","authors":"Shichao Zhang, Jilian Zhang, Xiaofeng Zhu, Zifang Huang","doi":"10.1109/ICDM.2006.84","DOIUrl":"https://doi.org/10.1109/ICDM.2006.84","url":null,"abstract":"An association rule A → B is useful to predict that B will likely occur when A occurs. This is a classical association rule. In real-world applications, such as bioinformatics and medical research, there are many follow correlations between itemsets A and B: B likely occurs n times after A occurred m times, written as <Am, Bn>. We refer to this follow-correlation as P3.1 itemset-pairs because <A3, B1> like that in the example (Example 2) should be uninterested in association analysis. This paper designs an efficient algorithm for identifying P3.1 itemset-pairs in sequential data. We experimentally evaluate our approach, and demonstrate that the proposed approach is efficient and promising.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134132880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Integrating Features from Different Sources for Music Information Retrieval
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.89
Tao Li, M. Ogihara, Shenghuo Zhu
{"title":"Integrating Features from Different Sources for Music Information Retrieval","authors":"Tao Li, M. Ogihara, Shenghuo Zhu","doi":"10.1109/ICDM.2006.89","DOIUrl":"https://doi.org/10.1109/ICDM.2006.89","url":null,"abstract":"Efficient and intelligent music information retrieval is a very important topic of the 21st century. With the ultimate goal of building personal music information retrieval systems, this paper studies the problem of identifying \"similar\" artists using both lyrics and acoustic data. In this paper, we present a clustering algorithm that integrates features from both sources to perform bimodal learning. The algorithm is tested on a data set consisting of 570 songs from 53 albums of 41 artists using artist similarity provided by All Music Guide. Experimental results show that the accuracy of artist similarity classifiers can be significantly improved and that artist similarity can be efficiently identified.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134002341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Mining Latent Associations of Objects Using a Typed Mixture Model--A Case Study on Expert/Expertise Mining
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.109
Shenghua Bao, Yunbo Cao, B. Liu, Yong Yu, Hang Li
{"title":"Mining Latent Associations of Objects Using a Typed Mixture Model--A Case Study on Expert/Expertise Mining","authors":"Shenghua Bao, Yunbo Cao, B. Liu, Yong Yu, Hang Li","doi":"10.1109/ICDM.2006.109","DOIUrl":"https://doi.org/10.1109/ICDM.2006.109","url":null,"abstract":"This paper studies the problem of discovering latent associations among objects in text documents. Specifically, given two sets of objects and various types of co-occurrence data concerning the objects existing in texts, we aim to discover the hidden or latent associative relationships between the two sets of objects. Existing methods are not directly applicable as they are unable to consider all this information. For example, the probabilistic mixture model called Separable Mixture Model (SMM) proposed by Hofmann can use only one type of co-occurrences to mine latent associations. This paper proposes a more general probabilistic mixture model called the Typed Separable Mixture Model (TSMM), which is able to use all types of co-occurrences within a single framework. Experimental results based on the expert/expertise mining task show that TSMM outperforms SMM significantly.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133244545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Meta Clustering
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.103
R. Caruana, M. Elhawary, Nam Nguyen, Casey Smith
{"title":"Meta Clustering","authors":"R. Caruana, M. Elhawary, Nam Nguyen, Casey Smith","doi":"10.1109/ICDM.2006.103","DOIUrl":"https://doi.org/10.1109/ICDM.2006.103","url":null,"abstract":"Clustering is ill-defined. Unlike supervised learning where labels lead to crisp performance criteria such as accuracy and squared error, clustering quality depends on how the clusters will be used. Devising clustering criteria that capture what users need is difficult. Most clustering algorithms search for optimal clusterings based on a pre-specified clustering criterion. Our approach differs. We search for many alternate clusterings of the data, and then allow users to select the clustering(s) that best fit their needs. Meta clustering first finds a variety of clusterings and then clusters this diverse set of clusterings so that users must only examine a small number of qualitatively different clusterings. We present methods for automatically generating a diverse set of alternate clusterings, as well as methods for grouping clusterings into meta clusters. We evaluate meta clustering on four test problems and two case studies. Surprisingly, clusterings that would be of most interest to users often are not very compact clusterings.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133371129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 187
STAGGER: Periodicity Mining of Data Streams Using Expanding Sliding Windows
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.153
Mohamed G. Elfeky, Walid G. Aref, A. Elmagarmid
{"title":"STAGGER: Periodicity Mining of Data Streams Using Expanding Sliding Windows","authors":"Mohamed G. Elfeky, Walid G. Aref, A. Elmagarmid","doi":"10.1109/ICDM.2006.153","DOIUrl":"https://doi.org/10.1109/ICDM.2006.153","url":null,"abstract":"Sensor devices are becoming ubiquitous, especially in measurement and monitoring applications. Because of the real-time, append-only and semi-infinite natures of the generated sensor data streams, an online incremental approach is a necessity for mining stream data types. In this paper, we propose STAGGER: a one-pass, online and incremental algorithm for mining periodic patterns in data streams. STAGGER does not require that the user pre-specify the periodicity rate of the data. Instead, STAGGER discovers the potential periodicity rates. STAGGER maintains multiple expanding sliding windows staggered over the stream, where computations are shared among the multiple overlapping windows. Small-length sliding windows are imperative for early and real-time output, yet are limited to discover short periodicity rates. As streamed data arrives continuously, the sliding windows expand in length in order to cover the whole stream. Larger-length sliding windows are able to discover longer periodicity rates. STAGGER incrementally maintains a tree-like data structure for the frequent periodic patterns of each discovered potential periodicity rate. In contrast to the Fourier/Wavelet-based approaches used for discovering periodicity rates, STAGGER not only discovers a wider, more accurate set of periodicities, but also discovers the periodic patterns themselves. In fact, experimental results with real and synthetic data sets show that STAGGER outperforms Fourier/Wavelet-based approaches by an order of magnitude in terms of the accuracy of the discovered periodicity rates. Moreover, real-data experiments demonstrate the practicality of the discovered periodic patterns.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116202227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
A Balanced Ensemble Approach to Weighting Classifiers for Text Classification
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.2
G. Fung, J. Yu, Haixun Wang, D. Cheung, Huan Liu
{"title":"A Balanced Ensemble Approach to Weighting Classifiers for Text Classification","authors":"G. Fung, J. Yu, Haixun Wang, D. Cheung, Huan Liu","doi":"10.1109/ICDM.2006.2","DOIUrl":"https://doi.org/10.1109/ICDM.2006.2","url":null,"abstract":"This paper studies the problem of constructing an effective heterogeneous ensemble classifier for text classification. One major challenge of this problem is to formulate a good combination function, which combines the decisions of the individual classifiers in the ensemble. We show that the classification performance is affected by three weight components and they should be included in deriving an effective combination function. They are: (1) Global effectiveness, which measures the effectiveness of a member classifier in classifying a set of unseen documents; (2) Local effectiveness, which measures the effectiveness of a member classifier in classifying the particular domain of an unseen document; and (3) Decision confidence, which describes how confident a classifier is when making a decision when classifying a specific unseen document. We propose a new balanced combination function, called dynamic classifier weighting (DCW), that incorporates the aforementioned three components. The empirical study demonstrates that the new combination function is highly effective for text classification.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122019707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
An Interactive Semantic Video Mining and Retrieval Platform--Application in Transportation Surveillance Video for Incident Detection
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.20
Xin Chen, Chengcui Zhang
{"title":"An Interactive Semantic Video Mining and Retrieval Platform--Application in Transportation Surveillance Video for Incident Detection","authors":"Xin Chen, Chengcui Zhang","doi":"10.1109/ICDM.2006.20","DOIUrl":"https://doi.org/10.1109/ICDM.2006.20","url":null,"abstract":"Understanding and retrieving videos based on their semantic contents is an important research topic in multimedia data mining and has found various real-world applications. Most existing video analysis techniques focus on the low level visual features of video data. However, there is a \"semantic gap\" between the machine-readable features and the high level human concepts i.e. human understanding of the video content. In this paper, an interactive platform for semantic video mining and retrieval is proposed using relevance feedback (RF), a popular technique in the area of content-based image retrieval (CBIR). By tracking semantic objects in a video and then modeling spatio-temporal events based on object trajectories and object interactions, the proposed interactive learning algorithm in the platform is able to mine the spatio-temporal data extracted from the video. An iterative learning process is involved in the proposed platform, which is guided by the user's response to the retrieved results. Although the proposed video retrieval platform is intended for general use and can be tailored to many applications, we focus on its application in traffic surveillance video database retrieval to demonstrate the design details. The effectiveness of the algorithm is demonstrated by our experiments on real-life traffic surveillance videos.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124838853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 37
Direct Marketing When There Are Voluntary Buyers
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.54
Yi-Ting Lai, Ke Wang, Daymond Ling, Hua Shi, Jason J. Zhang
{"title":"Direct Marketing When There Are Voluntary Buyers","authors":"Yi-Ting Lai, Ke Wang, Daymond Ling, Hua Shi, Jason J. Zhang","doi":"10.1109/ICDM.2006.54","DOIUrl":"https://doi.org/10.1109/ICDM.2006.54","url":null,"abstract":"In traditional direct marketing, the implicit assumption is that customers will only purchase the product if they are contacted. In real business environments, however, there are \"voluntary buyers,\" who will still make the purchase in the absence of a contact. While no direct promotion is needed for voluntary buyers, the traditional response-driven paradigm tends to target such customers. This paper presents \"influential marketing,\" targeting only those whose purchase decisions can be positively influenced, i.e. buyers who are non-voluntary. Our novel, practical solution to this problem gives promising results.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130089082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Exploratory Under-Sampling for Class-Imbalance Learning
Sixth International Conference on Data Mining (ICDM'06) Pub Date : 2006-12-18 DOI: 10.1109/ICDM.2006.68
Xu-Ying Liu, Jianxin Wu, Zhi-Hua Zhou
{"title":"Exploratory Under-Sampling for Class-Imbalance Learning","authors":"Xu-Ying Liu, Jianxin Wu, Zhi-Hua Zhou","doi":"10.1109/ICDM.2006.68","DOIUrl":"https://doi.org/10.1109/ICDM.2006.68","url":null,"abstract":"Under-sampling is a class-imbalance learning method which uses only a subset of major class examples and thus is very efficient. The main deficiency is that many major class examples are ignored. We propose two algorithms to overcome the deficiency. EasyEnsemble samples several subsets from the major class, trains a learner using each of them, and combines the outputs of those learners. BalanceCascade is similar to EasyEnsemble except that it removes correctly classified major class examples of trained learners from further consideration. Experiments show that both of the proposed algorithms have better AUC scores than many existing class-imbalance learning methods. Moreover, they have approximately the same training time as that of under-sampling, which trains significantly faster than other methods.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130419350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1475
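The EasyEnsemble idea summarized in the abstract above (draw several subsets of the majority class, train one learner per subset, combine the learners' outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid base learner, the score-summing combination rule, the toy data, and every function name here are assumptions chosen to keep the example short and dependency-free.

```python
import random


def centroid(rows):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_nearest_centroid(minority, majority_subset):
    """Hypothetical stand-in base learner (an assumption, not the paper's).

    Returns a scoring function; a positive score means 'closer to the
    minority centroid than to the majority centroid'.
    """
    c_min = centroid(minority)
    c_maj = centroid(majority_subset)
    return lambda x: sq_dist(x, c_maj) - sq_dist(x, c_min)


def easy_ensemble(minority, majority, n_subsets=5, seed=0):
    """Train one learner per balanced majority-class subset, then sum scores."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_subsets):
        # Under-sample: a majority subset the same size as the minority class.
        subset = rng.sample(majority, len(minority))
        learners.append(train_nearest_centroid(minority, subset))
    # Combine the learners' outputs by summing their scores.
    return lambda x: sum(h(x) for h in learners)


# Toy imbalanced data (made up for illustration): 200 majority, 10 minority.
data_rng = random.Random(1)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
minority = [[data_rng.gauss(4, 1), data_rng.gauss(4, 1)] for _ in range(10)]

score = easy_ensemble(minority, majority)
print("minority-like point:", score([4.0, 4.0]) > 0)
print("majority-like point:", score([0.0, 0.0]) > 0)
```

Because each subset is only as large as the minority class, every learner trains on a small balanced set, which is consistent with the abstract's remark that training time stays close to that of plain under-sampling. BalanceCascade, as described in the abstract, would additionally remove correctly classified majority examples before drawing the next subset; that step is not shown here.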