{"title":"Automatic relevance source determination in human brain tumors using Bayesian NMF","authors":"S. Ortega-Martorell, I. Olier, M. Julià-Sapé, C. Arús, P. Lisboa","doi":"10.1109/CIDM.2014.7008654","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008654","url":null,"abstract":"The clinical management of brain tumors is very sensitive; thus, their non-invasive characterization is often preferred. Non-negative Matrix Factorization techniques have been successfully applied in the context of neuro-oncology to extract the underlying source signals that explain different tissue tumor types, for which knowing the number of sources to calculate was always required. In the current study we estimate the number of relevant sources for a set of discrimination problems involving brain tumors and normal brain. For this, we propose to start by calculating a high number of sources using Bayesian NMF and automatically discarding the irrelevant ones during the iterative process of matrices decomposition, hence obtaining a reduced range of interpretable solutions. The real data used in this study come from a widely tested human brain tumor database. Simulated data that resembled the real data was also generated to validate the hypothesis against ground truth. The results obtained suggest that the proposed approach is able to provide a small range of meaningful solutions to the problem of source extraction in human brain tumors.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128056418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization of relational database usage involving Big Data a model architecture for Big Data applications","authors":"Erin-Elizabeth A. Durham, Andrew Rosen, R. Harrison","doi":"10.1109/CIDM.2014.7008703","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008703","url":null,"abstract":"Effective Big Data applications dynamically handle the retrieval of decisioned results based on stored large datasets efficiently. One effective method of requesting decisioned results, or querying, large datasets is the use of SQL and database management systems such as MySQL. But a problem with using relational databases to store huge datasets is the decisioned result retrieval time, which is often slow largely due to poorly written queries/decision requests. This work presents a model to re-architect Big Data applications in order to efficiently present decisioned results: lowering the volume of data being handled by the application itself, and significantly decreasing response wait times while allowing the flexibility and permanence of a standard relational SQL database, supplying optimal user satisfaction in today's Data Analytics world. We experimentally demonstrate the effectiveness of our approach.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116966148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A sparsity-based training algorithm for Least Squares SVM","authors":"Jie Yang, Jun Ma","doi":"10.1109/CIDM.2014.7008688","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008688","url":null,"abstract":"We address the training problem of the sparse Least Squares Support Vector Machines (SVM) using compressed sensing. The proposed algorithm regards the support vectors as a dictionary and selects the important ones that minimize the residual output error iteratively. A measurement matrix is also introduced to reduce the computational cost. The main advantage is that the proposed algorithm performs model training and support vector selection simultaneously. The performance of the proposed algorithm is tested with several benchmark classification problems in terms of number of selected support vectors and size of the measurement matrix. Simulation results show that the proposed algorithm performs competitively when compared to existing methods.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123550076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tibetan-Chinese cross language named entity extraction based on comparable corpus and naturally annotated resources","authors":"Yuan Sun, W. Guo, Xiaobing Zhao","doi":"10.1109/CIDM.2014.7008680","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008680","url":null,"abstract":"Tibetan-Chinese named entity extraction can effectively improve the performance of Tibetan-Chinese cross language question answering system, information retrieval, machine translation and other researches. In the condition of no practical Tibetan named entity recognition system and Tibetan-Chinese translation model, this paper proposes a method to extract Tibetan-Chinese entities based on comparable corpus and naturally annotated resources from webs. The main work of this paper is in the following: (1) Tibetan-Chinese comparable corpus construction. (2) Combining sentence length, word matching and boundary term features, using multi-feature fusion algorithm to obtain parallel sentences from comparable corpus. (3) Tibetan-Chinese entity mapping based on the maximum word continuous intersection model of parallel sentence. Finally, the experimental results show that our approach can effectively find Tibetan-Chinese cross language named entity.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"236 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121995550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantum clustering — A novel method for text analysis","authors":"Ding Liu, Minghu Jiang, Xiaofang Yang","doi":"10.1109/CIDM.2014.7008143","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008143","url":null,"abstract":"The article introduces quantum clustering inspired from the quantum mechanics and extended to text analysis. This novel method upgrades the nonparametric density estimation and, different from the latter, quantum clustering constructs the potential function to determine the cluster center instead of the Gaussian kernel function. The result of a comparative experiment proves the advantage of quantum clustering over the conventional Parzen-window, and the further trial on authorship identification illustrates the wide application scope of this novel method.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126686453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering data over time using kernel spectral clustering with memory","authors":"R. Langone, Raghvendra Mall, J. Suykens","doi":"10.1109/CIDM.2014.7008141","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008141","url":null,"abstract":"This paper discusses the problem of clustering data changing over time, a research domain that is attracting increasing attention due to the increased availability of streaming data in the Web 2.0 era. In the analysis conducted throughout the paper we make use of the kernel spectral clustering with memory (MKSC) algorithm, which is developed in a constrained optimization setting. Since the objective function of the MKSC model is designed to explicitly incorporate temporal smoothness, the algorithm belongs to the family of evolutionary clustering methods. Experiments over a number of real and synthetic datasets provide very interesting insights in the dynamics of the clusters evolution. Specifically, MKSC is able to handle objects leaving and entering over time, and recognize events like continuing, shrinking, growing, splitting, merging, dissolving and forming of clusters. Moreover, we discover how one of the regularization constants of the MKSC model, referred as the smoothness parameter, can be used as a change indicator measure. Finally, some possible visualizations of the cluster dynamics are proposed.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131201591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Matching social network biometrics using geo-analytical behavioral modeling","authors":"M. Rahmes, K. Fox, J. Delay, Gran Roe","doi":"10.1109/CIDM.2014.7008699","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008699","url":null,"abstract":"Social patterns and graphical representation of geospatial activity is important for describing a person's typical behavior. We discuss a framework using social media and GPS smart phone to track an individual and establish normal activity with a network biometric. An individual's daily routine may include visiting many locations - home, work, shopping, entertainment and other destinations. All of these activities pose a routine or status quo of expected behavior. What has always been difficult, however, is predicting a change to the status quo, or predicting unusual behavior. We propose taking the knowledge of location information over a relatively long period of time and marrying that with modern analytical capabilities. The result is a biometric that can be fused and correlated with another's behavioral biometric to determine relationships. Our solution is based on the analytical environment to support the ingestion of many data sources and the integration of analytical algorithms such as feature extraction, crowd source analysis, open source data mining, trends, pattern analysis and linear game theory optimization. Our framework consists of a hierarchy of data, space, time, and knowledge entities. We exploit such statistics to predict behavior or activity based on past observations. We use multivariate mutual information as a measure to compare behavioral biometrics.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132829842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized information theoretic cluster validity indices for soft clusterings","authors":"Yang Lei, J. Bezdek, Jeffrey Chan, X. Nguyen, Simone Romano, J. Bailey","doi":"10.1109/CIDM.2014.7008144","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008144","url":null,"abstract":"There have been a large number of external validity indices proposed for cluster validity. One such class of cluster comparison indices is the information theoretic measures, due to their strong mathematical foundation and their ability to detect non-linear relationships. However, they are devised for evaluating crisp (hard) partitions. In this paper, we generalize eight information theoretic crisp indices to soft clusterings, so that they can be used with partitions of any type (i.e., crisp or soft, with soft including fuzzy, probabilistic and possibilistic cases). We present experimental results to demonstrate the effectiveness of the generalized information theoretic indices.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115753262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ontology learning with complex data type for Web service clustering","authors":"B. Kumara, Incheon Paik, K. Koswatte, Wuhui Chen","doi":"10.1109/CIDM.2014.7008658","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008658","url":null,"abstract":"Clustering Web services into functionally similar clusters is a very efficient approach to service discovery. A principal issue for clustering is computing the semantic similarity between services. Current approaches use similarity-distance measurement methods such as keyword, information-retrieval or ontology based methods. These approaches have problems that include discovering semantic characteristics, loss of semantic information and a shortage of high-quality ontologies. Further, current clustering approaches consider only simple data types in services' input and output. However, services published on the web have input/output parameters of complex data types. In this research, we propose a clustering approach that considers simple as well as complex data types in measuring service similarity. We use the hybrid term similarity method, which we proposed in our previous work, to measure the similarity. We capture the semantic patterns that exist in complex data types and simple data types to improve the ontology learning method. Experimental results show that our clustering approach, which uses complex data types in measuring similarity, works efficiently.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129889511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New bilinear formulation to semi-supervised classification based on Kernel Spectral Clustering","authors":"V. Jumutc, J. Suykens","doi":"10.1109/CIDM.2014.7008146","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008146","url":null,"abstract":"In this paper we present a novel semi-supervised classification approach which combines bilinear formulation for non-parallel binary classifiers based upon Kernel Spectral Clustering. The cornerstone of our approach is a bilinear term introduced into the primal formulation of semi-supervised classification problem. In addition we perform separate manifold regularization for each individual classifier. The latter relates to the Kernel Spectral Clustering unsupervised counterpart which helps to obtain more precise and generalizable classification boundaries. We derive the dual problem which can be effectively translated into a linear system of equations and then solved without introducing extra costs. In our experiments we show the usefulness and report considerable improvements in performance with respect to other semi-supervised approaches, like Laplacian SVMs and other KSC-based models.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122859181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}