2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM): Latest Publications

IQ estimation for accurate time-series classification
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949441
Krisztián Búza, A. Nanopoulos, L. Schmidt-Thieme
Abstract: Due to its various applications, time-series classification is a prominent research topic in data mining and computational intelligence. The simple k-NN classifier using dynamic time warping (DTW) distance has been shown to be competitive with other state-of-the-art time-series classifiers. In our research, however, we observed that a single fixed choice for the number of nearest neighbors k may lead to suboptimal performance. This is due to the complexity of time-series data, especially because the characteristics of the data may vary from region to region. Therefore, local adaptations of the classification algorithm are required. To address this problem in a principled way, in this paper we introduce individual quality (IQ) estimation. This refers to estimating the expected classification accuracy for each time series and each k individually. Based on the IQ estimations, we combine the classification results of several k-NN classifiers into a final prediction. In our framework of IQ, we develop two time-series classification algorithms, IQ-MAX and IQ-WV. In our experiments on 35 commonly used benchmark data sets, we show that both IQ-MAX and IQ-WV outperform two baselines.
Citations: 3
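As a rough illustration of the setting this abstract describes, the sketch below implements plain DTW, a k-NN vote, and a quality-weighted combination of several k values. It is not the authors' IQ-MAX or IQ-WV algorithm: the function names and the `quality_by_k` mapping are hypothetical, and how the per-series quality is estimated is not reproduced here.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_predict(train, query, k):
    """Majority vote among the k training series nearest to query under DTW.

    train is a list of (series, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: dtw_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def weighted_vote(train, query, quality_by_k):
    """Combine several k-NN predictions, weighting each k by an estimated quality.

    quality_by_k is a hypothetical mapping {k: estimated accuracy for this query}.
    """
    scores = Counter()
    for k, quality in quality_by_k.items():
        scores[knn_predict(train, query, k)] += quality
    return scores.most_common(1)[0][0]
```

Picking the single k with the highest estimated quality, instead of summing weighted votes, would be the other natural way to use such per-query estimates.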
On the use of decision trees for ICU outcome prediction in sepsis patients treated with statins
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949439
V. Ribas, J. Lopez, J. Ruiz-Rodríguez, Adolf Ruiz-Sanmartin, J. Rello, A. Vellido
Abstract: Sepsis is one of the main causes of death for noncoronary ICU (Intensive Care Unit) patients and has become the tenth most common cause of death in western societies. This is a transversal condition affecting immunocompromised patients, critically ill patients, post-surgery patients, patients with AIDS, and the elderly. In western countries, septic patients account for as much as 25% of ICU bed utilization, and the pathology affects 1% to 2% of all hospitalizations. Its mortality rates range from 12.8% for sepsis to 45.7% for septic shock. Early administration of antibiotics is known to be crucial for ICU outcomes. In this regard, statins, a class of drugs, have been shown to present good anti-inflammatory properties beyond their regulation of the biosynthesis of cholesterol. In this brief paper, we hypothesize that preadmission use of statins improves ICU outcomes. We test this hypothesis in a prospective study in patients admitted with severe sepsis and multiorgan failure at the ICU of Vall d'Hebron University Hospital (Barcelona, Spain), using statistic algebraic models and regression trees.
Citations: 12
About the analysis of time series with temporal association rule mining
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949303
Tim Schlüter, Stefan Conrad
Abstract: This paper addresses the issue of analyzing time series with temporal association rule mining techniques. Since association rule mining was originally developed for the analysis of transactional data, as it occurs for instance in market basket analysis, algorithms and time series have to be adapted in order to apply these techniques gainfully to the analysis of time series in general. Continuous time series of different origins can be discretized in order to mine several temporal association rules, which reveals interesting coherences within one time series and between pairs of time series. Depending on the domain, the knowledge about these coherences can be used for several purposes, e.g. for the prediction of future values of time series. We present a short review of different standard and temporal association rule mining approaches and of approaches that apply association rule mining to time series analysis. In addition, we explain in detail how some of the most interesting kinds of temporal association rules can be mined from continuous time series and present a prototype implementation. We demonstrate and evaluate our implementation on two large datasets containing river level measurements and stock data.
Citations: 22
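The discretize-then-mine idea in this abstract can be sketched in a few lines. The example below is only an assumption-laden illustration, not the paper's prototype: the threshold-based discretization, the function names, and the single rule form "antecedent at time t implies consequent at time t + lag" are all chosen here for simplicity.

```python
def discretize(series, thresholds, labels):
    """Map each value to a symbol.

    thresholds are ascending upper bounds; labels has one more entry than
    thresholds, the last label covering everything above the final bound.
    """
    symbols = []
    for value in series:
        for bound, label in zip(thresholds, labels):
            if value <= bound:
                symbols.append(label)
                break
        else:
            symbols.append(labels[-1])
    return symbols


def rule_stats(antecedent_seq, consequent_seq, antecedent, consequent, lag=1):
    """Support and confidence of 'antecedent at t => consequent at t + lag'.

    Both sequences are lists of symbols sampled at the same time points; pass
    the same sequence twice to mine a rule within a single time series.
    """
    pairs = list(zip(antecedent_seq, consequent_seq[lag:]))
    if not pairs:
        return 0.0, 0.0
    n_antecedent = sum(1 for a, _ in pairs if a == antecedent)
    n_both = sum(1 for a, c in pairs if a == antecedent and c == consequent)
    support = n_both / len(pairs)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Made-up river-level readings, discretized into two symbols.
levels = discretize([1.0, 1.2, 3.5, 3.8, 1.1, 3.9, 4.0, 1.0], [2.0], ["low", "high"])
support, confidence = rule_stats(levels, levels, "low", "high", lag=1)
```

Passing two different symbol sequences to `rule_stats` gives the cross-series variant, for example a rule linking one river gauge to another a few time steps later.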
A framework for semi-automated process instance discovery from decorative attributes
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949450
Andrea Burattin, R. Vigo
Abstract: Process mining is a relatively new field of research: its final aim is to bridge the gap between data mining and business process modelling. In particular, the assumption underpinning this discipline is the availability of data coming from business process executions. In business process theory, once the process has been defined, it is possible to have a number of instances of the process running at the same time. Usually, the identification of different instances relies on a specific “case id” field in the log exploited by process mining techniques. The software systems that support the execution of a business process, however, often do not explicitly record such information. This paper presents an approach that addresses the absence of the “case id” information: we have a set of extra fields, decorating each single activity log, that are known to carry the information on the process instance. A framework is presented, based on simple relational algebra notions, to extract the most promising case ids from the extra fields. The work is a generalization of a real business case.
Citations: 16
Geodesic distances for web document clustering
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949449
Selma Tekir, Florian Mansmann, D. Keim
Abstract: While traditional distance measures are often capable of properly describing similarity between objects, in some application areas there is still potential to fine-tune these measures with additional information provided in the data sets. In this work, we combine such traditional distance measures for document analysis with link information between documents to improve clustering results. In particular, we test the effectiveness of geodesic distances as similarity measures under the space assumption of spherical geometry in a 0-sphere. Our proposed distance measure is thus a combination of the cosine distance of the term-document matrix and some curvature values in the geodesic distance formula. To estimate these curvature values, we calculate clustering coefficient values for every document from the link graph of the data set and increase their distinctiveness by means of a heuristic, as these clustering coefficient values are rough estimates of the curvatures. To evaluate our work, we perform clustering tests with the k-means algorithm on the English Wikipedia hyperlinked data set with both traditional cosine distance and our proposed geodesic distance. The effectiveness of our approach is measured by computing micro-precision values of the clusters based on the provided categorical information of each article.
Citations: 6
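Two ingredients named in this abstract have standard textbook definitions that can be sketched directly: the great-circle (arc) distance between two term vectors projected onto the unit sphere, and the local clustering coefficient of a node in the link graph. The abstract does not give the formula that combines them through curvature values, so the sketch below deliberately stops short of it; the function names and the graph representation are assumptions, and the vectors are assumed to be non-zero.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def arc_distance(u, v):
    """Great-circle distance between u and v after projection onto the unit sphere."""
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.acos(cos)


def clustering_coefficient(graph, node):
    """Local clustering coefficient in an undirected graph.

    graph maps each node to the set of its neighbours.
    """
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    # Count each linked pair of neighbours once.
    links = sum(1 for a in neighbours for b in neighbours if a < b and b in graph[a])
    return 2.0 * links / (k * (k - 1))
```

On the unit sphere the arc distance is a monotone function of the cosine similarity, so any difference in clustering results would have to come from how the curvature estimates enter the distance.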
Logistic sub-models for small size populations in credit scoring
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-11 DOI: 10.1109/CIDM.2011.5949425
Bouaguel Waad, F. Beninel, G. B. Mufti
Abstract: Credit scoring risk management is a fast-growing field due to consumers' credit requests. Credit requests from new and existing customers are often evaluated by classical discrimination rules based on customer information. However, these kinds of strategies have serious limits and do not take into account the differences in characteristics between current customers and future ones. The aim of this paper is to measure creditworthiness for non-customer borrowers and to model potential risk given a heterogeneous population formed by borrowers who are customers of the bank and others who are not. We build on previous work in generalized discrimination and transpose it into the logistic model to derive efficient discrimination rules for the non-customer subpopulation. We thus obtain seven simple models linking the parameters of the two logistic models associated with the two subpopulations. The German credit data set is selected as the experimental data to compare the seven models. Experimental results show that the use of links between the two subpopulations improves the classification accuracy for new loan applicants.
Citations: 0
Opening black box Data Mining models using Sensitivity Analysis
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2011-04-01 DOI: 10.1109/CIDM.2011.5949423
P. Cortez, M. Embrechts
Abstract: There are several supervised learning Data Mining (DM) methods, such as Neural Networks (NN), Support Vector Machines (SVM) and ensembles, that often attain high-quality predictions, although the obtained models are difficult for humans to interpret. In this paper, we open these black box DM models by using a novel visualization approach that is based on a Sensitivity Analysis (SA) method. In particular, we propose a Global SA (GSA), which extends the applicability of previous SA methods (e.g. to classification tasks), and several visualization techniques (e.g. variable effect characteristic curve), for assessing input relevance and effects on the model's responses. We show the GSA capabilities by conducting several experiments, using an NN ensemble and an SVM model, on both synthetic and real-world datasets.
Citations: 99
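To make the general idea of sensitivity analysis concrete, the sketch below shows a simple one-dimensional variant: each input is swept over its range while the others are held at a baseline, and the spread of the model's responses is taken as that input's relevance. This is an illustrative assumption, not the GSA method proposed in the paper; the function names, the baseline choice, and the range-based importance measure are all hypothetical.

```python
def sensitivity_1d(model, baseline, ranges, levels=5):
    """Sweep each input over its range with the other inputs held at baseline.

    model takes a list of input values and returns a number.
    Returns {input index: (responses at each level, response range)}.
    """
    results = {}
    for index, (low, high) in enumerate(ranges):
        responses = []
        for step in range(levels):
            point = list(baseline)
            point[index] = low + (high - low) * step / (levels - 1)
            responses.append(model(point))
        results[index] = (responses, max(responses) - min(responses))
    return results


def relative_importance(results):
    """Normalize the response ranges so that they sum to 1."""
    total = sum(spread for _, spread in results.values())
    return {i: (spread / total if total else 0.0) for i, (_, spread) in results.items()}


def toy_model(x):
    """Stand-in for a fitted black box model: the second input is ignored."""
    return 3 * x[0] + 0 * x[1] + x[2] ** 2


results = sensitivity_1d(toy_model, [0.5, 0.5, 0.5], [(0.0, 1.0)] * 3)
importance = relative_importance(results)
```

Plotting the list of responses for one input against the swept values gives a curve in the spirit of the variable effect characteristic curve mentioned in the abstract.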