{"title":"Discovering Knowledge Rules with Multi-Objective Evolutionary Computing","authors":"Rafael Giusti, Gustavo E. A. P. A. Batista","doi":"10.1109/ICMLA.2010.25","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.25","url":null,"abstract":"Most Machine Learning systems aim to induce classifiers with optimal coverage and precision measures. Although this constitutes a good approach for prediction, it might not provide good results when the user is more interested in description. In this case, the induced models should present other properties such as novelty, interestingness and so forth. In this paper we present a research work based on Multi-Objective Evolutionary Computing to construct individual knowledge rules targeting arbitrary user-defined criteria via objective quality measures such as precision, support, novelty, etc. This paper also presents a comparison between multi-objective and ranking composition techniques. It is shown that multi-objective-based methods attain better results than ranking-based methods, both in terms of solution dominance and diversity of solutions in the Pareto front.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131372601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Smoothing Gene Expression Using Biological Networks","authors":"Yue Fan, M. Kon, Shinuk Kim, C. DeLisi","doi":"10.1109/ICMLA.2010.85","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.85","url":null,"abstract":"Gene expression (microarray) data have been used widely in bioinformatics. The expression data of a large number of genes from small numbers of subjects are used to identify informative biomarkers that may predict or help in diagnosing some disorders. More recently, increasing amounts of information from underlying relationships of the expressed genes have become available, and workers have started to investigate algorithms which can use such a priori information to improve classification or regression based on gene expression. In this paper, we describe three novel machine learning algorithms for regularizing (smoothing) microarray expression values defined on gene sets with known prior network or metric structures, and which exploit this gene interaction information. These regularized expression values can be used with any machine classifier with the goal of better classification. In this paper, standard smoothing (denoising) techniques previously developed for functions on Euclidean spaces are extended to allow smoothing of microarray expression feature vectors using distance measures defined by biological networks. Such a priori smoothing (denoising) of the feature vectors using metrics on the index space (here the space of genes) yields better signal to noise ratios in the data. When tested on two breast cancer datasets, support vector machine classifiers trained on the smoothed expression values obtain better areas under ROC curves.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125304600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Batch Size Selection for Batch Mode Active Learning in Biometrics","authors":"Shayok Chakraborty, V. Balasubramanian, S. Panchanathan","doi":"10.1109/ICMLA.2010.10","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.10","url":null,"abstract":"Robust biometric recognition is of paramount importance in security and surveillance applications. In face based biometric systems, data is usually collected using a video camera with high frame rate and thus the captured data has high redundancy. Selecting the appropriate instances from this data to update a classification model is a significant yet valuable challenge. Active learning methods have gained popularity in identifying the salient and exemplar data instances from superfluous sets. Batch mode active learning schemes attempt to select a batch of samples simultaneously rather than updating the model after selecting every single data point. Existing work on batch mode active learning assumes a fixed batch size, which is not a practical assumption in biometric recognition applications. In this paper, we propose a novel framework to dynamically select the batch size using clustering based unsupervised learning techniques. We also present a batch mode active learning strategy specially suited to handle the high redundancy in biometric datasets. The results obtained on the challenging VidTIMIT and MOBIO datasets corroborate the superiority of dynamic batch size selection over static batch size and also certify the potential of the proposed active learning scheme in being used for real world biometric recognition applications.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126589556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Parameter-exploring Policy Gradients","authors":"Frank Sehnke, Alex Graves, Christian Osendorfer, J. Schmidhuber","doi":"10.1109/ICMLA.2010.24","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.24","url":null,"abstract":"Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estimates encountered in normal policy gradient methods. It has been shown to drastically speed up convergence for several large-scale reinforcement learning tasks. However the independent normal distributions used by PGPE to search through parameter space are inadequate for some problems with multimodal reward surfaces. This paper extends the basic PGPE algorithm to use multimodal mixture distributions for each parameter, while remaining efficient. Experimental results on the Rastrigin function and the inverted pendulum benchmark demonstrate the advantages of this modification, with faster convergence to better optima.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116764486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Validating Meronymy Hypotheses with Support Vector Machines and Graph Kernels","authors":"Tim vor der Brück, H. Helbig","doi":"10.1109/ICMLA.2010.43","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.43","url":null,"abstract":"There is a substantial body of work on the extraction of relations from texts, most of which is based on pattern matching or on applying tree kernel functions to syntactic structures. Whereas pattern application is usually more efficient, tree kernels can be superior when assessed by the F-measure. In this paper, we introduce a hybrid approach to extracting meronymy relations, which is based on both patterns and kernel functions. In a first step, meronymy relation hypotheses are extracted from a text corpus by applying patterns. In a second step these relation hypotheses are validated by using several shallow features and a graph kernel approach. In contrast to other meronymy extraction and validation methods which are based on surface or syntactic representations we use a purely semantic approach based on semantic networks. This involves analyzing each sentence of the Wikipedia corpus by a deep syntactico-semantic parser and converting it into a semantic network. Meronymy relation hypotheses are extracted from the semantic networks by means of an automated theorem prover, which employs a set of logical axioms and patterns in the form of semantic networks. The meronymy candidates are then validated by means of a graph kernel approach based on common walks. The evaluation shows that this method achieves considerably higher accuracy, recall, and F-measure than a method using purely shallow validation.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123300367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Segmentation of the Prostate Using a Genetic Algorithm for Prostate Cancer Treatment Planning","authors":"Melanie Mitchell, J. Tanyi, A. Hung","doi":"10.1109/ICMLA.2010.115","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.115","url":null,"abstract":"This paper presents a genetic algorithm (GA) for combining representations of learned priors such as shape, regional properties and relative location of organs into a single framework in order to perform automated segmentation of the prostate. Prostate segmentation is typically performed manually by an expert physician and is used to determine the locations for radioactive seed placement during radiotherapy treatment planning. The GA accounts for the uncertainty in the definitions of tumor margins by combining known representations of shape, texture and relative location of organs to perform automatic segmentation in two (2D) as well as three dimensions (3D).","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129615961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting End-to-end Network Load","authors":"A. Vashist, S. Mau, A. Poylisher, R. Chadha, Abhrajit Ghosh","doi":"10.1109/ICMLA.2010.145","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.145","url":null,"abstract":"Due to their limited and fluctuating bandwidth, mobile ad hoc networks (MANETs) are inherently resource-constrained. As traffic load increases, we need to decide when to throttle the traffic to maximize user satisfaction while keeping the network operational. The state-of-the-art for making these decisions is based on network measurements and so employs a reactive approach to deteriorating network state by reducing the amount of traffic admitted into the network. However, a better approach is to avoid congestion before it occurs by predicting future network traffic using user and application information from the overlaying social network. We use machine learning methods to predict the source and destination of near future traffic load.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128863266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Chunking Method for Euclidean Distance Matrix Calculation on Large Dataset Using Multi-GPU","authors":"Qi Li, V. Kecman, R. Salman","doi":"10.1109/ICMLA.2010.38","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.38","url":null,"abstract":"Calculating a Euclidean distance matrix is a data intensive operation and becomes computationally prohibitive for large datasets. Recent development of Graphics Processing Units (GPUs) has produced superb performance on scientific computing problems using massive parallel processing cores. However, due to the limited size of device memory, many GPU based algorithms have low capability in solving problems with large datasets. In this paper, a chunking method is proposed to calculate the Euclidean distance matrix on large datasets. This is not only designed for scalability in multi-GPU environment but also to maximize the computational capability of each individual GPU device. We first implement a fast GPU algorithm that is suitable for calculating submatrices of the Euclidean distance matrix. Then we utilize a Map-Reduce like framework to split the final distance matrix calculation into many small independent jobs of calculating partial distance matrices, which can be efficiently solved by our GPU algorithm. The framework also dynamically allocates GPU resources to those independent jobs for maximum performance. The experimental results have shown a speedup of 15x on datasets which contain more than half a million data points.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128925188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Relative Tendency Based Stock Market Prediction System","authors":"ManChon U, K. Rasheed","doi":"10.1109/ICMLA.2010.151","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.151","url":null,"abstract":"Researchers have known for some time that non-linearity exists in the financial markets and that neural networks can be used to forecast market returns. In this article, we present a novel stock market prediction system which focuses on forecasting the relative tendency growth between different stocks and indices rather than purely predicting their values. This research utilizes artificial neural network models for estimation. The results are examined for their ability to provide an effective forecast of future values. Certain techniques, such as sliding windows and chaos theory, are employed for data preparation and pre-processing. Our system successfully predicted the relative tendency growth of different stocks with up to 99.01% accuracy.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"04 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129983862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constrained Nonnegative Tensor Factorization for Clustering","authors":"Wei Peng","doi":"10.1109/ICMLA.2010.152","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.152","url":null,"abstract":"Constrained clustering through matrix factorization has been shown to largely improve clustering accuracy by incorporating prior knowledge into the factorization process. Although this approach has been well studied, no existing method deals with constrained multi-way data factorization. Multi-way data, or tensors, are encoded as high-order data structures. They can be seen as the generalization of matrices. One typical tensor is multiple two-way data/matrices in different time periods. To the best of our knowledge, this paper is the first work to develop two general formulations of constrained nonnegative tensor factorization. An extensive experiment conducts a comparative study of the proposed constrained nonnegative tensor factorization and other state-of-the-art algorithms.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127772472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}