2002 IEEE International Conference on Data Mining, 2002. Proceedings: Latest Articles

gSpan: graph-based substructure pattern mining
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1184038
Xifeng Yan, Jiawei Han
Abstract: We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts the depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.
Citations: 2414
Wavelet based UXO detection
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1184012
S. Hodgson, N. Dunstan, R. Murison
Abstract: The detection and classification of unexploded ordnance (UXO) is considered a multidimensional pattern recognition problem. Standard techniques for solving multidimensional detection and classification problems involve using large sets of templates or libraries. This paper shows that by using wavelet transformation a single library will allow a particular class of ordnance to be classified over a range of depths.
Citations: 1
Using functional PCA for cardiac motion exploration
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1183890
D. Clot
Abstract: Principal component analysis (PCA) is a major tool in multivariate data analysis. Its paradigms are also used in Karhunen-Loeve decomposition, a standard tool in image processing. Extensions of PCA to the framework of functional data have been proposed. The analysis provided by functional PCA seems to be a powerful tool for finding principal sources of variability in curves or images, but fails to provide easy interpretations in the case of multifunctional data. Guidelines aimed at spotting information in the outputs of PCA applied to functionals with values in the space of continuous functions on a bounded domain are proposed. An application to cardiac motion analysis illustrates the complexity of the multifunctional framework and the results provided by functional PCA.
Citations: 5
Speed-up iterative frequent itemset mining with constraint changes
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1183892
G. Cong, B. Liu
Abstract: Mining of frequent itemsets is a fundamental data mining task. Past research has proposed many efficient algorithms for this purpose. Recent work has also highlighted the importance of using constraints to focus the mining process on only the relevant itemsets. In practice, data mining is often an interactive and iterative process. The user typically changes constraints and runs the mining algorithm many times before being satisfied with the final results. This interactive process is very time consuming. Existing mining algorithms are unable to take advantage of this iterative process to use previous mining results to speed up the current mining process. This results in an enormous waste of time and computation. In this paper, we propose an efficient technique to utilize previous mining results to improve the efficiency of current mining when constraints are changed. We first introduce the concept of tree boundary to summarize useful information available from previous mining. We then show that the tree boundary provides an effective and efficient framework for the new mining. The proposed technique has been implemented in the context of two existing frequent itemset mining algorithms, FP-tree and tree projection. Experimental results on both synthetic and real-life datasets show that the proposed approach achieves a dramatic saving of computation.
Citations: 42
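To illustrate the motivation in the abstract above (reusing earlier results when a constraint changes), here is a minimal Python sketch of the simplest case only: the minimum-support constraint is tightened, so every itemset that is frequent under the new threshold was already found in the previous run. This is not the tree boundary technique from the paper; the function name `reuse_for_tighter_support` and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet

Itemsets = Dict[FrozenSet[str], int]  # itemset -> support count


def reuse_for_tighter_support(
    previous: Itemsets, old_min_support: int, new_min_support: int
) -> Itemsets:
    """Derive the new result from the previous one without rescanning the data.

    Only valid when the threshold is raised: an itemset with support at least
    new_min_support then also had support at least old_min_support, so it is
    already present in `previous`.
    """
    if new_min_support < old_min_support:
        # Relaxing the constraint can admit itemsets that were never counted,
        # so filtering alone is not enough in that direction.
        raise ValueError("filtering only works when the threshold is raised")
    return {
        itemset: count
        for itemset, count in previous.items()
        if count >= new_min_support
    }


# Hypothetical result of an earlier run with minimum support 3.
previous = {
    frozenset({"bread"}): 5,
    frozenset({"milk"}): 4,
    frozenset({"bread", "milk"}): 3,
}
tightened = reuse_for_tighter_support(previous, old_min_support=3, new_min_support=4)
```

The harder direction, relaxing a constraint, is where a summary of the previous search (such as the tree boundary the abstract mentions) would be needed; the sketch deliberately refuses that case.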
FD_Mine: discovering functional dependencies in a database using equivalences
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1184040
Hong Yao, Howard J. Hamilton, C. Butz
Abstract: The discovery of FDs from databases has recently become a significant research problem. In this paper, we propose a new algorithm, called FD_Mine. FD_Mine takes advantage of the rich theory of FDs to reduce both the size of the dataset and the number of FDs to be checked by using discovered equivalences. We show that the pruning does not lead to loss of information. Experiments on 15 UCI datasets show that FD_Mine can prune more candidates than previous methods.
Citations: 59
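As background for the abstract above, the following Python sketch shows what it means for a functional dependency (FD) to hold in a table, and one plausible reading of an "equivalence" between attribute sets (each set functionally determines the other). It is a brute-force check for illustration, not the FD_Mine algorithm; the names `fd_holds` and `equivalent` and the sample rows are assumptions.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]


def fd_holds(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if lhs -> rhs holds: rows that agree on lhs also agree on rhs."""
    seen: Dict[Tuple[Hashable, ...], Tuple[Hashable, ...]] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value;
        # any later mismatch violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


def equivalent(rows: List[Row], x: Sequence[str], y: Sequence[str]) -> bool:
    """Assumed meaning of equivalence: x -> y and y -> x both hold."""
    return fd_holds(rows, x, y) and fd_holds(rows, y, x)


# Hypothetical table with three attributes.
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 2, "b": "y", "c": 10},
    {"a": 1, "b": "x", "c": 10},
]
a_determines_b = fd_holds(rows, ["a"], ["b"])
c_determines_a = fd_holds(rows, ["c"], ["a"])
a_equivalent_b = equivalent(rows, ["a"], ["b"])
```

If two attribute sets are equivalent in this sense, any dependency involving one can be restated with the other, which suggests why discovered equivalences could reduce the number of candidates to check.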
Mining online users' access records for web business intelligence
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1184047
S. Fong, Serena Chan
Abstract: This paper discusses how business intelligence on a website can be obtained from users' access records instead of web logs of "hits". Users' access records are captured by implementing an Access-Control (AC) architectural model on the website. This model requires users to register their profiles in exchange for a password; thereafter they have to log in before gaining access to certain resources on the website. The links to the resources on the website have been modified so that information about each access is recorded in the database when a link is clicked. This way, data mining can be performed on a relatively clean set of access records about the users. Hence, a good deal of business intelligence about the users' behaviors and preferences, and about the popularity of the resources (products) on the website, can be gained. In this paper, we also discuss how the acquired business intelligence can, in turn, be used to provide e-CRM for the users.
Citations: 1
On a capacity control using Boolean kernels for the learning of Boolean functions
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1183934
Ken Sadohara
Abstract: This paper concerns the classification task in discrete attribute spaces, but considers the task in a more fundamental framework: the learning of Boolean functions. The purpose of this paper is to present a new learning algorithm for Boolean functions called Boolean kernel classifier (BKC), employing capacity control using Boolean kernels. BKC uses support vector machines (SVMs) as learning engines, and Boolean kernels are primarily used for running SVMs in feature spaces spanned by conjunctions of Boolean literals. However, another important role of Boolean kernels is to appropriately control the size of the hypothesis space, to avoid overfitting. After applying an SVM to learn a classifier f in a feature space H induced by a Boolean kernel, BKC uses another Boolean kernel to compute the projections f^k of f onto a subspace H_k of H spanned by conjunctions with length at most k. By evaluating the accuracy of f^k on training data for any k, BKC can determine the smallest k such that f^k is as accurate as f and learn another f' in H_k expected to have lower error for unseen data. An empirical study on learning of randomly generated Boolean functions shows that the capacity control is effective, and that BKC outperforms C4.5 and naive Bayes classifiers.
Citations: 11
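The abstract above refers to feature spaces spanned by conjunctions of length at most k. As a rough sketch of how a kernel can work in such a space without enumerating the conjunctions, the Python code below handles the simpler monotone case (positive literals only): if two Boolean vectors share s attributes set to 1, then the number of non-empty conjunctions of at most k attributes that are true for both is the sum of C(s, i) for i from 1 to k. The function names are hypothetical, and this is not claimed to be the exact kernel used by BKC.

```python
from math import comb
from typing import Sequence


def shared_true_count(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of positions where both Boolean vectors are 1."""
    return sum(1 for a, b in zip(x, y) if a == 1 and b == 1)


def bounded_monotone_kernel(x: Sequence[int], y: Sequence[int], k: int) -> int:
    """Count conjunctions of 1..k positive literals satisfied by both x and y.

    This equals the inner product of x and y in the feature space that has
    one 0/1 feature per such conjunction.
    """
    s = shared_true_count(x, y)
    # comb(s, i) is 0 when i > s, so a large k is harmless.
    return sum(comb(s, i) for i in range(1, k + 1))


x = [1, 1, 0, 1]
y = [1, 0, 0, 1]
k1 = bounded_monotone_kernel(x, y, k=1)  # single shared attributes only
k2 = bounded_monotone_kernel(x, y, k=2)  # adds the one shared pair
```

When k is at least the number of shared attributes, the sum collapses to 2^s - 1, the count of all non-empty conjunctions over the shared attributes; lowering k shrinks the feature space, which is the kind of capacity control the abstract describes.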
Mining similar temporal patterns in long time-series data and its application to medicine
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1183906
S. Hirano, S. Tsumoto
Abstract: Data mining in time-series medical databases has been receiving considerable attention since it provides a way of revealing useful information hidden in the database, for example relationships between the temporal course of examination results and the onset time of diseases. This paper presents a new method for finding similar patterns in temporal sequences. The method is a hybridization of phase-constraint multiscale matching and rough clustering. Multiscale matching enables cross-scale comparison of the sequences; that is, it enables us to compare temporal patterns by partially changing observation scales. Rough clustering enables us to construct interpretable clusters of the sequences even if their similarities are given as relative similarities. We combine these methods and cluster the sequences according to multiscale similarity of patterns. Experimental results on the chronic hepatitis dataset showed that clusters demonstrating interesting temporal patterns were successfully discovered.
Citations: 50
Clustering spatial data when facing physical constraints
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1184042
Osmar R Zaiane, Chi-Hoon Lee
Abstract: Clustering spatial data is a well-known problem that has been extensively studied to find hidden patterns or meaningful sub-groups and has many applications such as satellite imagery, geographic information systems, medical image analysis, etc. Although many methods have been proposed in the literature, very few have considered constraints such as physical obstacles and bridges linking clusters, which may have significant consequences on the effectiveness of the clustering. Taking into account these constraints during the clustering process is costly, and the effective modeling of the constraints is of paramount importance for good performance. In this paper we define the clustering problem in the presence of constraints (obstacles and crossings) and investigate its efficiency and effectiveness for large databases. In addition, we introduce a new approach to model these constraints to prune the search space and reduce the number of polygons to test during clustering. The algorithm DBCluC we present detects clusters of arbitrary shape and is insensitive to noise and the input order. Its average running complexity is O(N log N), where N is the number of data objects.
Citations: 63
Recognition of common areas in a Web page using visual information: a possible application in a page classification
2002 IEEE International Conference on Data Mining, 2002. Proceedings. Pub Date : 2002-12-09 DOI: 10.1109/ICDM.2002.1183910
M. Kovačević, Michelangelo Diligenti, M. Gori, V. Milutinovic
Abstract: Extracting and processing information from Web pages is an important task in many areas like constructing search engines, information retrieval, and data mining from the Web. A common approach in the extraction process is to represent a page as a "bag of words" and then to perform additional processing on such a flat representation. We propose a new, hierarchical representation that includes browser screen coordinates for every HTML object in a page. Using visual information one is able to define heuristics for the recognition of common page areas such as header, left and right menu, footer and center of a page. We show in initial experiments that, using our heuristics, the defined objects are recognized properly in 73% of cases. Finally, we show that a Naive Bayes classifier, taking into account the proposed representation, clearly outperforms the same classifier using only information about the content of documents.
Citations: 132