2008 IEEE International Conference on Data Mining Workshops: Latest Papers

Mining Temporal Patterns with Quantitative Intervals
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.16
Thomas Guyet, R. Quiniou
Abstract: In this paper we consider the problem of discovering frequent temporal patterns in a database of temporal sequences, where a temporal sequence is a set of items with associated dates and durations. Since the quantitative temporal information appears to be fundamental in many contexts, it is taken into account in the mining processes and returned as part of the extracted knowledge. To this end, we have adapted the classical Apriori (Agrawal and Srikant, 1995) framework to propose an efficient algorithm based on a hyper-cube representation of temporal sequences. The extraction of quantitative temporal information is performed using a density estimation of the distribution of event intervals from the temporal sequences. An evaluation on synthetic data sets shows that the proposed algorithm can robustly extract frequent temporal patterns with quantitative temporal extents.
Citations: 45
Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA and Decision Boundary Analysis
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.44
Keigo Yoshida, M. Inui, T. Yairi, K. Machida, Masaki Shioya, Y. Masukawa
Abstract: This paper addresses the problem of identifying the causal variables of a system anomaly. In complicated real-world systems, even experts often fail to specify causal factors, so they attempt to detect the anomaly with exploratory heuristics. Our goal is to offer further information that supports anomaly cause analysis using the incomplete empirical knowledge. The proposed technique discovers the factors responsible for the fault by leveraging domain knowledge with an effective combination of semi-supervised linear discriminant analysis (LDA) and a boundary-based discriminative subspace identification method. Experimental results on synthetic and real datasets confirmed the validity of our approach. Moreover, we applied this method to building energy fault diagnosis and succeeded in extracting causal variables for energy waste in a building.
Citations: 9
Hierarchical Text Categorization in a Transductive Setting
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.126
Michelangelo Ceci
Abstract: Transductive learning is the learning setting that makes it possible to learn from "particular to particular" and to consider both labelled and unlabelled examples when taking classification decisions. In this paper, we investigate the use of transductive learning in the context of hierarchical text categorization. To this end, we exploit a modified version of an inductive hierarchical learning framework that makes it possible to classify documents in internal and leaf nodes of a hierarchy of categories. Experimental results on real-world datasets are reported.
Citations: 10
Title-Composing Support System for Reaching New Audiences
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.24
Yoko Nishihara, W. Sunayama
Abstract: This paper proposes a support system for composing good titles for research papers in order to reach new audiences. Our system takes titles as input. The system evaluates the understandability and interest level of each title. The system ranks titles and outputs a title list. Users are able to recompose their titles by referring to the list and each evaluation value. Using the system, users can obtain new audiences who have not previously been interested in the user's research area. Experimental results showed that our system is able to rank titles in descending order of audiences' choices.
Citations: 6
Efficient Distance Computation Using SQL Queries and UDFs
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.135
Sasi K. Pitchaimalai, C. Ordonez, Carlos Garcia-Alvarado
Abstract: Distance computation is one of the most computationally intensive operations employed by many data mining algorithms. Performing such matrix computations within a DBMS creates many optimization challenges. We propose techniques to efficiently compute Euclidean distance using SQL queries and user-defined functions (UDFs). We concentrate on efficient Euclidean distance computation for the well-known K-means clustering algorithm. We present SQL query optimizations and a scalar UDF to compute Euclidean distance. We experimentally evaluate performance and scalability of our proposed SQL queries and UDF with large data sets on a modern DBMS. We benchmark distance computation on two important data mining techniques: clustering and classification. In general, UDFs are faster than SQL queries because they are executed in main memory. Data set size is the main factor impacting performance, followed by data set dimensionality.
Citations: 12
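As a rough illustration of the operation this abstract centers on, the sketch below computes Euclidean distances from points to centroids and picks the nearest centroid, which is the assignment step of K-means. It is plain in-memory Python, not the SQL queries or scalar UDF proposed in the paper; the function names `euclidean_distance` and `nearest_centroid` and the sample data are assumptions made for this example only.

```python
import math


def euclidean_distance(x, c):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(x) != len(c):
        raise ValueError("points must have the same dimensionality")
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))


def nearest_centroid(x, centroids):
    """Index of the centroid closest to x (the K-means assignment step)."""
    return min(
        range(len(centroids)),
        key=lambda j: euclidean_distance(x, centroids[j]),
    )


# Hypothetical sample data: three 2-D points and two centroids.
points = [(0.0, 0.0), (3.0, 4.0), (9.0, 9.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

# One distance is computed per (point, centroid) pair, so the cost grows
# with the number of points, the number of centroids, and the dimensionality.
assignments = [nearest_centroid(p, centroids) for p in points]
print(assignments)
```

The nested loop over points and centroids makes the dependence on data set size and dimensionality visible, which matches the two performance factors the abstract names.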
Interactive Exploration of Model-Based Automatically Extracted Data
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.34
A. Coden, I. Sominsky, M. Tanenblatt
Abstract: We present an interactive system to query, explore and navigate data according to a hierarchical knowledge model that had been automatically populated from unstructured textual data. Our system differs from systems assisting in the navigation of domain ontologies and mining between pairs of concepts in that it enables access to unstructured data by abstract concepts and relations between them. Concepts in turn are specified by sets of models and their relations. However, some concepts may not have a direct representation in the text. In particular, the demonstration query by model/cancer (QbM/C) is based on unstructured pathology reports. The knowledge model represents both named entities such as diagnosis and anatomical site, and higher level concepts such as primary and metastatic tumor. Such concepts are based on the relations between named entities. We will present the data layout and access mechanism from the GUI to the data.
Citations: 0
A Robust Graph-Based Algorithm for Detection and Characterization of Anomalies in Noisy Multivariate Time Series
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.48
H. Cheng, P. Tan, C. Potter, S. Klooster
Abstract: Detection of anomalies in multivariate time series is an important data mining task with potential applications in medical diagnosis, ecosystem modeling, and network traffic monitoring. In this paper, we present a robust graph-based algorithm for detecting anomalies in noisy multivariate time series data. A key feature of the algorithm is the alignment of kernel matrices constructed from the time series. The aligned kernel enables the algorithm to capture the dependence relationship between different time series and to support the discovery of different types of anomalies (including subsequence-based and local anomalies). We have performed extensive experiments to demonstrate the effectiveness of the proposed algorithm. We also present a case study that shows the utility of applying our algorithm to detect ecosystem disturbances in Earth science data.
Citations: 31
ARUBAS: An Association Rule Based Similarity Framework for Associative Classifiers
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.58
B. Depaire, K. Vanhoof, G. Wets
Abstract: This article introduces ARUBAS, a new framework to build associative classifiers. In contrast with many existing associative classifiers, it uses class association rules to transform the feature space and uses instance-based reasoning to classify new instances. The framework allows the researcher to use any association rule mining algorithm to produce the class association rules. Every aspect of the framework is extensively introduced and discussed and five different fitness measures used for classification purposes are defined. The empirical results determine which fitness measure is the best and compare the framework with other classifiers. These results show that the ARUBAS framework is able to produce associative classifiers which are competitive with other classification techniques. More specifically, with ARUBAS-Scheffer-phi5 we have introduced a parameter-free algorithm which is competitive with classification techniques such as C4.5, RIPPER and CBA.
Citations: 11
Exploiting Data Semantics to Discover, Extract, and Model Web Sources
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.134
J. Ambite, Craig A. Knoblock, Kristina Lerman, Anon Plangprasopchok, Thomas A. Russ, Cenk Gazen, Steven Minton, Mark James Carman
Abstract: We describe Deimos, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that make an end-to-end solution to this problem possible. First, given an example source, Deimos finds other similar sources online. Second, it invokes and extracts data from these sources. Third, given the syntactic structure of a source, Deimos maps its inputs and outputs to semantic types. Finally, it infers the source's semantic definition, i.e., the function that maps the inputs to the outputs. Deimos is able to successfully automate these steps by exploiting a combination of background knowledge and data semantics. We describe the challenges in integrating separate components into a unified approach to discovering, extracting and modeling new online sources. We provide an end-to-end validation of the system in two information domains to show that it can successfully discover and model new data sources in those domains.
Citations: 4
Unifying Unknown Nodes in the Internet Graph Using Semisupervised Spectral Clustering
2008 IEEE International Conference on Data Mining Workshops Pub Date : 2008-12-15 DOI: 10.1109/ICDMW.2008.12
Anat Almog, J. Goldberger, Y. Shavitt
Abstract: Most research on Internet topology is based on active measurement methods. A major difficulty in using these tools is that one comes across many unresponsive routers. Different methods of dealing with these anonymous nodes to preserve the connectivity of the real graph have been suggested. One of the more practical approaches involves using a placeholder for each unknown, resulting in multiple copies of every such node. This significantly distorts and inflates the inferred topology. Our goal in this work is to unify groups of placeholders in the IP-level graph. We introduce a novel clustering algorithm based on semisupervised spectral embedding of all the nodes followed by clustering of the anonymous nodes in the projected space. Experimental results on real Internet data are provided that show good similarity to the true networks.
Citations: 6