2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM): Latest Papers, Page 3

Relational data partitioning using evolutionary game theory

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008656

L. Hall, Alireza Chakeri

Abstract: This paper presents a new approach for relational data partitioning using the notion of dominant sets. A dominant set is a subset of data points satisfying the constraints of internal homogeneity and external inhomogeneity, i.e., a cluster. However, since any subset of a dominant set cannot be a dominant set itself, dominant sets tend to be compact sets. Hence, in this paper, we present a novel approach to enumerate well-distributed clusters where the number of clusters need not be known. When the number of clusters is known, in order to search the solution space appropriately, after finding each dominant set, data points are partitioned into two disjoint subsets using spectral graph image segmentation methods to enumerate the other well-distributed dominant sets. For the latter case, we introduce a new hierarchical approach for relational data partitioning using a new class of evolutionary game theory dynamics called InImDynamics, which is very fast and linear in computational time with respect to the number of data points. In this regard, at each level of the proposed hierarchy, Dunn's index is used to find the appropriate number of clusters. Then the objects are partitioned based on the projected number of clusters using game theoretic relations. The same method is applied to each partition to extract its underlying structure. Although the resulting clusters exist in their equivalent partitions, they may not be clusters of the entire data. Hence, they are checked for being an actual cluster and, if they are not, they are extended to an existing cluster of the data. The approach can also be used to assign unseen data to existing clusters.
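The abstract above does not spell out InImDynamics, so the following is only a rough stand-in: a minimal sketch of the classic discrete-time replicator dynamics that is commonly used to extract a single dominant set from a pairwise similarity matrix. The function name, the `cutoff` used to read off cluster members, and the toy matrix `A` are all assumptions made for illustration and do not come from the paper.

```python
def replicator_dominant_set(A, iters=1000, tol=1e-9, cutoff=1e-4):
    """Extract one dominant set from a symmetric, non-negative similarity
    matrix A with a zero diagonal, using discrete-time replicator dynamics.
    Illustrative stand-in only; this is NOT the paper's InImDynamics."""
    n = len(A)
    x = [1.0 / n] * n  # start at the barycenter of the simplex
    for _ in range(iters):
        # Payoff of each point against the current population x.
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))  # average payoff
        if xAx == 0:
            break
        # Points with above-average payoff gain weight, the others lose it.
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if delta < tol:
            break
    # Points that keep non-negligible weight form the extracted cluster.
    members = [i for i, w in enumerate(x) if w > cutoff]
    return members, x


# Toy similarity matrix (hypothetical): points 0-2 are mutually similar,
# points 3-4 are only weakly related to anything.
A = [
    [0.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.1],
    [0.8, 0.9, 0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 0.0, 0.2],
    [0.0, 0.1, 0.0, 0.2, 0.0],
]
members, weights = replicator_dominant_set(A)
```

In this toy case the weight concentrates on the tightly connected points, which is the "internal homogeneity" the abstract refers to. Removing the extracted points and repeating on the remainder is one simple way to enumerate further clusters, although the paper describes its own partitioning scheme for that step.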

Citations: 7

Classification of iPSC colony images using hierarchical strategies with support vector machines

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008152

H. Joutsijoki, J. Rasku, Markus Haponen, Ivan Baldin, Y. Gizatdinova, M. Paci, Jyri Saarikoski, Kirsi Varpa, H. Siirtola, Jorge Avalos-Salguero, Kati Iltanen, J. Laurikkala, K. Penttinen, J. Hyttinen, K. Aalto-Setälä, M. Juhola

Abstract: In this preliminary research we examine the suitability of hierarchical strategies of multi-class support vector machines for classification of induced pluripotent stem cell (iPSC) colony images. The iPSC technology gives incredible possibilities for safe and patient-specific drug therapy without any ethical problems. However, growing iPSCs is a sensitive process and abnormalities may occur during the growing process. These abnormalities need to be recognized, and the problem reduces to image classification. We have a collection of 80 iPSC colony images, each of which is prelabeled by an expert as bad, good, or semigood. We use intensity histograms as features for classification, and we compute histograms from the whole image and from the colony area only, giving two datasets. We perform two feature reduction procedures for both datasets. In classification we examine how different hierarchical constructions affect the classification. We perform a thorough evaluation, and the best accuracy, around 54%, was obtained with the linear kernel function. Between different hierarchical structures, in many cases there are no significant changes in results. As a result, intensity histograms are a good baseline for the classification of iPSC colony images, but more sophisticated feature extraction and reduction methods together with other classification methods need to be researched in the future.
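The feature described in the abstract above, an intensity histogram, can be sketched in a few lines. This is only an illustration under stated assumptions: pixels are taken to be integer grayscale values in the range 0 to 255, the bin count of 8 is arbitrary, and the function name is hypothetical. The abstract does not give the authors' bin count or preprocessing.

```python
def intensity_histogram(pixels, bins=8, max_value=255):
    """Normalized histogram of integer grayscale intensities in [0, max_value].
    Bin count and value range are assumptions, not taken from the paper."""
    counts = [0] * bins
    for v in pixels:
        # Map the intensity to a bin index; clamp so max_value lands in the last bin.
        idx = min(v * bins // (max_value + 1), bins - 1)
        counts[idx] += 1
    total = len(pixels)
    # Normalize so images of different sizes give comparable feature vectors.
    return [c / total for c in counts] if total else [0.0] * bins


# Hypothetical flattened image: six pixels.
hist = intensity_histogram([0, 10, 31, 32, 128, 255])
```

Restricting `pixels` to the colony area instead of the whole image would give the second of the two datasets the abstract mentions.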

Citations: 6

Recommendation for Web services with domain specific context awareness

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008679

B. Kumara, Incheon Paik, K. Koswatte, Wuhui Chen

Citations: 2

Scaling a Neyman-Pearson subset selection approach via heuristics for mining massive data

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008701

G. Ditzler, M. Austen, G. Rosen, R. Polikar

Abstract: Feature subset selection is an important step towards producing a classifier that relies only on relevant features, while keeping the computational complexity of the classifier low. Feature selection is also used in making inferences on the importance of attributes, even when classification is not the ultimate goal. For example, in bioinformatics and genomics, feature subset selection is used to make inferences between the variables that best describe multiple populations. Unfortunately, many feature selection algorithms require the subset size to be specified a priori, but knowing how many variables to select is typically a nontrivial task. Other approaches rely on a specific variable subset selection framework to be used. In this work, we examine an approach to feature subset selection that works with a generic variable selection algorithm, and our approach provides statistical inference on the number of features that are relevant, which may be unknown to the generic variable selection algorithm. This work extends our previous implementation of a Neyman-Pearson feature selection (NPFS) hypothesis test, which acts as a meta-subset selection algorithm. Specifically, we examine the conservativeness of the NPFS approach by biasing the hypothesis test, and examine other heuristics for NPFS. We include results from carefully designed synthetic datasets. Furthermore, we demonstrate the NPFS's ability to perform on data of a massive scale.
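The abstract above describes NPFS as a hypothesis test wrapped around a generic selector. A minimal sketch of that idea follows, under an assumption that is not stated in the abstract: the base selector is run on several resampled datasets, each run picks `k` of `K` features, and under the null hypothesis a feature is picked by chance with probability `k / K`, so its selection count is binomial. All names and the toy data are hypothetical, and the biasing heuristics the abstract mentions are not modeled.

```python
from math import comb


def npfs_critical_value(n_runs, k, K, alpha=0.01):
    """Smallest count z with P(Z > z) <= alpha for Z ~ Binomial(n_runs, k / K)."""
    p0 = k / K  # chance of a feature being selected at random in one run
    tail = 1.0  # running value of P(Z > z), starting from P(Z > -1) = 1
    for z in range(n_runs + 1):
        tail -= comb(n_runs, z) * p0**z * (1 - p0) ** (n_runs - z)
        if tail <= alpha:
            return z
    return n_runs


def npfs_select(selections, K, alpha=0.01):
    """selections: one list of k selected feature indices per run.
    Returns the features selected more often than chance would explain."""
    n_runs = len(selections)
    k = len(selections[0])
    counts = [0] * K
    for sel in selections:
        for f in sel:
            counts[f] += 1
    z = npfs_critical_value(n_runs, k, K, alpha)
    return [f for f in range(K) if counts[f] > z]


# Hypothetical output of a base selector over 20 runs with K = 10, k = 3:
# features 0 and 1 are always picked, the third pick rotates among the rest.
selections = [[0, 1, 2 + (r % 8)] for r in range(20)]
z_crit = npfs_critical_value(20, 3, 10)
relevant = npfs_select(selections, K=10)
```

The point of the test is that the number of features declared relevant falls out of the counts, so the user-chosen `k` only needs to be a rough guess.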

Citations: 1

Comparing datasets by attribute alignment

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008148

Jakub Smíd, Roman Neruda

Citations: 2

Precision-Recall-Optimization in Learning Vector Quantization Classifiers for Improved Medical Classification Systems

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008150

T. Villmann, M. Kaden, M. Lange, P. Sturmer, W. Hermann

Abstract: Classification and decision systems in data analysis are mostly based on accuracy optimization. This criterion has only conditional informative value if the data are imbalanced or if false positive and false negative decisions cause different costs. Therefore, more sophisticated statistical quality measures, such as precision and recall, are favored in medicine. However, most classification approaches in machine learning are designed for accuracy optimization. In this paper we consider variants of learning vector quantizers (LVQs) that explicitly optimize those advanced statistical quality measures while keeping the basic intuitive ingredients of these classifiers, which are the prototype-based principle and Hebbian learning. In particular, we focus in this contribution on precision and recall as important measures for use in medical applications. We investigate these problems in terms of precision-recall curves as well as receiver operating characteristic (ROC) curves, which are well known in statistical classification and test analysis. With the underlying more general framework, we provide principled alternatives to traditional classifiers, such that a closer connection to statistical classification analysis can be drawn.
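The contrast the abstract above draws between accuracy and precision/recall can be made concrete with a small calculation. This sketch only computes the measures from predictions; it does not implement the LVQ variants of the paper, and the labels below are made-up illustration data.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced data: 2 positives among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred)
```

Here the accuracy looks respectable while half of the positives are missed and half of the positive predictions are wrong, which is the situation on imbalanced medical data that motivates optimizing precision and recall directly.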

Citations: 14

Alzheimer's disease patients classification through EEG signals processing

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008655

G. Fiscon, Emanuel Weitschek, G. Felici, P. Bertolazzi, S. D. Salvo, P. Bramanti, M. C. D. Cola

Abstract: Alzheimer's Disease (AD) and its preliminary stage, Mild Cognitive Impairment (MCI), are the most widespread neurodegenerative disorders, and their investigation remains an open challenge. Electroencephalography (EEG) is a non-invasive and repeatable technique for diagnosing brain abnormalities. Despite technical advances, the analysis of EEG spectra is usually carried out by experts who must manually perform laborious interpretations. Computational methods may lead to a quantitative analysis of these signals and hence to a characterization of EEG time series. The aim of this work is to achieve an automatic patient classification from the EEG biomedical signals involved in AD and MCI in order to support medical doctors in formulating the right diagnosis. The analysis of the biological EEG signals requires effective and efficient computer science methods to extract relevant information. Data mining, which guides the automated knowledge discovery process, is a natural way to approach EEG data analysis. Specifically, in our work we apply the following analysis steps: (i) pre-processing of EEG data; (ii) processing of the EEG signals by the application of time-frequency transforms; and (iii) classification by means of machine learning methods. We obtain promising results from the classification of AD, MCI, and control samples that can assist medical doctors in identifying the pathology.

Citations: 41

Novelty detection applied to the classification problem using Probabilistic Neural Network

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008677

Balvant Yadav, V. Devi

Abstract: A novel pattern is an observation that differs from the rest of the data. The task of novelty detection is to build a model which identifies novel patterns in a data set. This model has to be built in such a way that if a pattern is distant from the given training data, it is classified as a novel pattern; otherwise it is classified into one of the given classes. In this paper, we present two such new models for novelty detection, based on the Probabilistic Neural Network. In the first model, we generate negative examples around the target class data and then train the classifier with these negative examples. In the second model, which is an incremental model, we present a new method to find the optimal threshold for each class; if the output value for a test pattern being assigned to a target class is less than the threshold of that class, we classify the pattern as novel. We show how decision boundaries are created both with and without the novelty detection mechanism in our model. We show a comparative performance of both approaches.
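The thresholding rule of the second model in the abstract above can be sketched as follows. The class score is a plain Gaussian Parzen-window average, as in a basic probabilistic neural network. The abstract does not describe how the optimal per-class thresholds are found, so they are passed in here as given numbers. The function names, `sigma`, the threshold values, and the toy training data are all assumptions.

```python
from math import exp


def pnn_scores(x, train, sigma=0.5):
    """train maps each class label to a list of training points (tuples).
    Returns the Gaussian Parzen-window score of x for every class."""
    scores = {}
    for label, points in train.items():
        total = 0.0
        for p in points:
            d2 = sum((a - b) ** 2 for a, b in zip(x, p))  # squared distance
            total += exp(-d2 / (2 * sigma**2))
        scores[label] = total / len(points)
    return scores


def classify_with_novelty(x, train, thresholds, sigma=0.5):
    """Assign x to the best-scoring class, or report "novel" when that
    score falls below the class threshold (thresholds are assumed given)."""
    scores = pnn_scores(x, train, sigma)
    best = max(scores, key=scores.get)
    return best if scores[best] >= thresholds[best] else "novel"


# Hypothetical two-class training data and thresholds.
train = {
    "a": [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2)],
    "b": [(3.0, 3.0), (3.2, 2.9), (2.8, 3.1)],
}
thresholds = {"a": 0.1, "b": 0.1}

near_a = classify_with_novelty((0.1, 0.1), train, thresholds)
near_b = classify_with_novelty((3.0, 3.0), train, thresholds)
far_away = classify_with_novelty((10.0, 10.0), train, thresholds)
```

Without the threshold check, the far-away point would still be forced into whichever class scores highest, which is the difference in decision boundaries that the abstract refers to.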

Citations: 7

Generalized kernel framework for unsupervised spectral methods of dimensionality reduction

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008664

Diego Hernán Peluffo-Ordóñez, J. Lee, M. Verleysen

Citations: 21

Recognizing gym exercises using acceleration data from wearable sensors

2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM) Pub Date : 2014-12-01 DOI: 10.1109/CIDM.2014.7008685

Heli Koskimäki, Pekka Siirtola

Abstract: Activity recognition approaches can be used for entertainment, to give people information about their own behavior, and to monitor and supervise people through their actions. Consequently, the number of studies based on wearable sensors has increased as well, and new applications of activity recognition are being invented in the process. In this study, gym data including 36 different exercise classes is used, with the future aim of creating automatic activity diaries that reliably show end users how many sets of a given exercise have been performed. The actual recognition is divided into two steps. In the first step, activity recognition of certain time intervals is performed, and in the second step a state-machine approach is used to decide when actual events (sets in gym data) were performed. The results showed that when recognizing different exercise sets from the same occasion (sequential exercise sets), on average, a window-wise true positive rate of over 96 percent can be achieved, and moreover, all the exercise events can be discovered using the state-machine approach. When using a separate validation test set, the accuracies decreased significantly for some classes, but even in this case, all the different sets were discovered for 26 different classes.
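The second step in the abstract above, turning window-wise predictions into exercise sets with a state machine, might look roughly like the sketch below. The abstract does not describe the authors' states or transition rules, so the design here is an assumption: a set starts after `min_on` consecutive windows of the same exercise and ends once more than `max_gap` windows pass without that exercise. All names, parameters, and labels are hypothetical.

```python
def detect_sets(window_labels, min_on=3, max_gap=2, null_label="rest"):
    """Turn per-window predictions into (label, first_window, last_window) events.
    Hypothetical two-state machine (idle / in a set); not the paper's design."""
    events = []
    current = None  # label of the set in progress, or None when idle
    start = last_seen = 0
    run_label, run_len = None, 0  # current run of identical window labels
    for i, lab in enumerate(window_labels):
        if lab == run_label:
            run_len += 1
        else:
            run_label, run_len = lab, 1
        if current is not None:
            if lab == current:
                last_seen = i  # the set continues
            elif i - last_seen > max_gap:
                # Too many windows without the exercise: close the set.
                events.append((current, start, last_seen))
                current = None
        if current is None and lab != null_label and run_len >= min_on:
            # Enough consecutive agreeing windows: open a new set.
            current, start, last_seen = lab, i - run_len + 1, i
    if current is not None:
        events.append((current, start, last_seen))
    return events


# Hypothetical window-wise classifier output with one misclassified window.
labels = ["rest", "rest", "squat", "squat", "squat", "curl", "squat", "squat",
          "rest", "rest", "rest", "curl", "curl", "curl", "curl", "rest"]
sets = detect_sets(labels)
```

The single stray `curl` window inside the squat set is absorbed instead of splitting the set, which suggests why event-level detection can stay reliable even when window-wise accuracy is imperfect.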

Citations: 25