{"title":"Facial image clustering in stereo videos using local binary patterns and double spectral analysis","authors":"G. Orfanidis, A. Tefas, N. Nikolaidis, I. Pitas","doi":"10.1109/CIDM.2014.7008670","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008670","url":null,"abstract":"In this work we propose the use of local binary patterns in combination with double spectral analysis for facial image clustering applied to 3D (stereoscopic) videos. Double spectral clustering involves the fusion of two well known algorithms: Normalized cuts and spectral clustering in order to improve the clustering performance. The use of local binary patterns upon selected fiducial points on the facial images proved to be a good choice for describing images. The framework is applied on 3D videos and makes use of the additional information deriving from the existence of two channels, left and right for further improving the clustering results.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131713806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic clustering-based cross-domain recommendation","authors":"Anil Kumar, Nitesh Kumar, M. Hussain, S. Chaudhury, Sumeet Agarwal","doi":"10.1109/CIDM.2014.7008659","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008659","url":null,"abstract":"Cross-domain recommendation systems exploit tags, textual descriptions or ratings available for items in one domain to recommend items in multiple domains. Handling unstructured/unannotated item information is, however, a challenge. Topic modeling offers a popular method for deducing structure in such data corpora. In this paper, we introduce the concept of a common latent semantic space, spanning multiple domains, using topic modeling of semantically clustered vocabularies of distinct domains. The intuition here is to use explicitly-determined semantic relationships between non-identical, but possibly semantically equivalent, words in multiple domain vocabularies, in order to capture relationships across information obtained in distinct domains. The popular WordNet-based ontology is used to measure semantic relatedness between textual words. The experimental results show that there is a marked improvement in the precision of predicting user preferences for items in one domain when given the preferences in another domain.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123915954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GoldMiner: A genetic programming based algorithm applied to Brazilian Stock Market","authors":"Alexandre Pimenta, F. Guimarães, E. G. Carrano, C. Nametala, R. Takahashi","doi":"10.1109/CIDM.2014.7008695","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008695","url":null,"abstract":"Obtaining financial gain by investing in stock markets is a hard task, since these markets are under the constant influence of economic, political and social factors. This paper aims to address the financial technical analysis of stock markets, focusing on time series data instead of subjective parameters. An algorithm based on genetic programming, named GoldMiner, has been proposed to perform a retrospective study in order to get predictions about the best time for trading top stocks on the BOVESPA, the Brazilian stock exchange market.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"256 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116014642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wolf search algorithm for attribute reduction in classification","authors":"Waleed Yamany, E. Emary, A. Hassanien","doi":"10.1109/CIDM.2014.7008689","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008689","url":null,"abstract":"Data sets ordinarily include a huge number of attributes, some of them irrelevant or redundant. Redundant and irrelevant attributes might reduce the classification accuracy because of the huge search space. The main goal of attribute reduction is to choose a subset of relevant attributes from a huge number of available attributes to obtain comparable or even better classification accuracy than using all attributes. A system for feature selection is proposed in this paper using a modified version of the wolf search algorithm (WSA). WSA is a bio-inspired heuristic optimization algorithm that imitates the way wolves search for food and survive by avoiding their enemies. The WSA can quickly search the feature space for an optimal or near-optimal feature subset minimizing a given fitness function. The proposed fitness function incorporates both classification accuracy and feature reduction size. The proposed system is applied on a set of the UCI machine learning data sets and shows good performance in comparison with the GA and PSO optimizers commonly used in this context.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129822245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"kNN estimation of the unilateral dependency measure between random variables","authors":"A. Cataron, Răzvan Andonie, Y. Chueh","doi":"10.1109/CIDM.2014.7008705","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008705","url":null,"abstract":"The informational energy (IE) can be interpreted as a measure of average certainty. In previous work, we have introduced a non-parametric asymptotically unbiased and consistent estimator of the IE. Our method was based on the kth nearest neighbor (kNN) method, and it can be applied to both continuous and discrete spaces, meaning that we can use it both in classification and regression algorithms. Based on the IE, we have introduced a unilateral dependency measure between random variables. In the present paper, we show how to estimate this unilateral dependency measure from an available sample set of discrete or continuous variables, using the kNN and the naïve histogram estimators. We experimentally compare the two estimators. Then, in a real-world application, we apply the kNN and the histogram estimators to approximate the unilateral dependency between random variables which describe the temperatures of sensors placed in a refrigerating room.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"446 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114958745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Valid interpretation of feature relevance for linear data mappings","authors":"Benoît Frénay, Daniela Hofmann, Alexander Schulz, Michael Biehl, B. Hammer","doi":"10.1109/CIDM.2014.7008661","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008661","url":null,"abstract":"Linear data transformations constitute essential operations in various machine learning algorithms, ranging from linear regression up to adaptive metric transformation. Often, linear scalings are used not only to improve the model accuracy; the feature coefficients provided by the mapping are also interpreted as an indicator for the relevance of the feature for the task at hand. This principle, however, can be misleading, in particular for high-dimensional or correlated features, since it easily marks irrelevant features as relevant or vice versa. In this contribution, we propose a mathematical formalisation of the minimum and maximum feature relevance for a given linear transformation which can efficiently be solved by means of linear programming. We evaluate the method in several benchmarks, where it becomes apparent that the minimum and maximum relevance closely resemble what is often referred to as weak and strong relevance of the features; hence, unlike the mere scaling provided by the linear mapping, it ensures valid interpretability.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129853310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A feature transformation method using genetic programming for two-class classification","authors":"T. Hiroyasu, T. Shiraishi, Tomoya Yoshida, U. Yamamoto","doi":"10.1109/CIDM.2014.7008673","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008673","url":null,"abstract":"In this paper, a feature transformation method for two-class classification using genetic programming (GP) is proposed. GP derives a transformation formula to improve the classification accuracy of a Support Vector Machine (SVM). We propose a weight function to evaluate the converted feature space, and the proposed function is used as the evaluation function of GP. In the proposed function, the ideal two-class distribution of items is assumed and the distance between the actual and ideal distributions is calculated. A weight is applied to these distances. To examine the effectiveness of the proposed function, a numerical experiment was performed. In the experiment, the proposed method achieved better classification accuracy than the existing method.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"25 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130716693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dependency network methods for Hierarchical Multi-label Classification of gene functions","authors":"F. Fabris, A. Freitas","doi":"10.1109/CIDM.2014.7008674","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008674","url":null,"abstract":"Hierarchical Multi-label Classification (HMC) is a challenging real-world problem that naturally emerges in several areas. This work proposes two new algorithms using a Probabilistic Graphical Model based on Dependency Networks (DN) to solve the HMC problem of classifying gene functions into pre-established class hierarchies. DNs are especially attractive for their capability of using traditional, “off-the-shelf”, classification algorithms to model the relationship among classes and for their ability to cope with cyclic dependencies, resulting in greater flexibility with respect to Bayesian Networks. We tested our two algorithms: the first is a stand-alone Hierarchical Dependency Network (HDN) algorithm, and the second is a hybrid between the HDN and the Predictive Clustering Tree (PCT) algorithm, a well-known classifier for HMC. Based on our experiments, the hybrid classifier, using SVMs as base classifiers, obtained higher predictive accuracy than both the standard PCT algorithm and the HDN algorithm, considering 22 bioinformatics datasets and two out of three predictive accuracy measures specific for hierarchical classification (AU(PRC) and AUPRCw).","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"27 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134426894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incremental transfer RULES with incomplete data","authors":"H. Elgibreen, M. Aksoy","doi":"10.1109/CIDM.2014.7008676","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008676","url":null,"abstract":"Recently, strong AI emerged from artificial intelligence due to the need for a thinking machine. In this domain, it is necessary to deal with dynamic incomplete data, and understanding how machines make their decisions is also important, especially in the information system domain. One type of learning called Covering Algorithms (CA) can be used instead of the difficult statistical machine learning methods to produce simple rules with powerful prediction ability. However, although using CA as the base of strong AI is a novel idea, doing so with the current methods available is not possible. Thus, this paper presents a novel CA (RULES-IT) and tests its performance over incomplete data. This algorithm is the first incremental algorithm in its family, and in CA as a whole, that transfers rules from different domains and introduces intelligent aspects using a simple representation. The performance of RULES-IT will be tested over incomplete and complete data along with other algorithms in the literature. It will be validated using 5-fold cross validation in addition to Friedman with Nemenyi post hoc tests to measure the significance and rank the algorithms.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124077675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A framework for initialising a dynamic clustering algorithm: ART2-A","authors":"Simon J. Chambers, I. Jarman, P. Lisboa","doi":"10.1109/CIDM.2014.7008678","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008678","url":null,"abstract":"Algorithms in the Adaptive Resonance Theory (ART) family adapt to structural changes in data as new information presents, making them exciting candidates for dynamic online clustering of big health data. Their use, however, has largely been restricted to the signal processing field. In this paper we introduce a refinement of the ART2-A method within an adapted separation and concordance (SeCo) framework which has been shown to identify stable and reproducible solutions from repeated initialisations and which also provides evidence for an appropriate number of initial clusters that best calibrates the algorithm with the data presented. The results show stable, reproducible solutions for a mix of real-world health-related datasets and well-known benchmark datasets, selecting solutions which better represent the underlying structure of the data than using a single measure of separation. The scalability of the method and its facility for dynamic online clustering make it suitable for finding structure in big data.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122405791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}