{"title":"Implementation of Parameter Space Search for Meta Learning in a Data-Mining Multi-agent System","authors":"O. Kazík, K. Pesková, M. Pilát, Roman Neruda","doi":"10.1109/ICMLA.2011.161","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.161","url":null,"abstract":"In this paper an implementation of a multi-agent system designed for solving complex data mining tasks is presented. The system is based on the ontologically sound AGR (agents, groups, roles) model and encapsulates Weka library methods in JADE agents. We emphasize the unique intelligent features of the system -- its ability to search the parameter space of the data mining methods to find the optimal configuration, and meta learning -- finding the best possible method for the given data based on the ontological compatibility of datasets.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125036549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature","authors":"K. Ravikumar, Haibin Liu, J. Cohn, M. Wall, Karin M. Verspoor","doi":"10.1109/ICMLA.2011.112","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.112","url":null,"abstract":"We propose a method enabling automatic extraction of protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids to the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. On a new automatically generated data set of high confidence protein-residue relationship sentences, established through distant supervision, the method achieved an F-measure of 0.78. This work will pave the way to improved extraction of protein functional residues from the literature.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125057589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structured Multivariate Pattern Classification to Detect MRI Markers for an Early Diagnosis of Alzheimer's Disease","authors":"C. Damon, E. Duchesnay, M. Depecker","doi":"10.1109/ICMLA.2011.185","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.185","url":null,"abstract":"Multiple kernel learning (MKL) provides flexibility by considering multiple data views and by searching for the best data representation through a combination of kernels. Clinical applications of neuroimaging have seen a recent upsurge in the use of multivariate machine learning methods to predict clinical status. However, they usually do not model structured information, such as cerebral spatial and functional networking, which could improve the predictive capacity of the model and which could be more meaningful for further neuroscientific interpretation. In this study, we applied an MKL-based approach to predict the prodromal stage of Alzheimer's disease (i.e. the early phase of the illness) with prior structured knowledge about the brain spatial neighborhood structure and the brain functional circuits linked to cognitive decline of AD. Compared to a set of classical multivariate linear classifiers, each one highlighting specific strategies, the smooth MKL-SVM method (i.e. Lp MKL-SVM) appeared to be the most powerful to distinguish both very mild and mild AD patients from healthy subjects.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127711693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web Ad-Slot Offline Scheduling Using an Ant Colony Algorithm","authors":"V. Palade, S. Banerjee","doi":"10.1109/ICMLA.2011.158","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.158","url":null,"abstract":"Online advertisements (ads) placed at different positions on a web page get different numbers of 'hits' depending on the position and the time the ad occupies a particular position of advertisement on the web page. The management of online advertisement slots (ad-slots) on a web page is a dynamic problem and various derivative-free optimization techniques could be employed for solving it. This paper presents an ant colony based algorithm for assigning bidders to click-generating ad-slots. The objective is to maximize the profit obtained from clicks on ads, under some budget constraints for bidders and some scheduling constraints on the slots. A few instances of results for ads' allocation and bidding have been presented in the paper and demonstrate the approach.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121226693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovering Clusters with Arbitrary Shapes and Densities in Data Streams","authors":"A. Magdy, N. A. Yousri, Nagwa M. El-Makky","doi":"10.1109/ICMLA.2011.56","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.56","url":null,"abstract":"The availability of streaming data in different fields and in various forms increases the importance of streaming data analysis. The huge size of a continuously flowing data has put forward a number of challenges in data stream analysis. Exploration of the structure of streamed data represented a major challenge that resulted in introducing various clustering algorithms. However, current clustering algorithms still lack the ability to efficiently discover clusters of arbitrary densities in data streams. In this paper, a new grid-based and density-based algorithm is proposed for clustering streaming data. It addresses drawbacks of recent algorithms in discovering clusters of arbitrary densities. The algorithm uses an online component to map the input data to grid cells. An offline component is then used to cluster the grid cells based on density information. Relative density relatedness measures and a dynamic range neighborhood are proposed to differentiate clusters of arbitrary densities. The experimental evaluation shows considerable improvements upon the state-of-the-art algorithms in both clustering quality and scalability. In addition, the output quality of the proposed algorithm is less sensitive to parameter selection errors.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116325741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Metric Learning for Music Symbol Recognition","authors":"Ana Rebelo, J. Tkaczuk, R. Sousa, Jaime S. Cardoso","doi":"10.1109/ICMLA.2011.94","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.94","url":null,"abstract":"Although Optical Music Recognition (OMR) has been the focus of much research for decades, the processing of handwritten musical scores is not yet satisfactory. The efforts made to find robust symbol representations and learning methodologies have not found a similar quality in the learning of the dissimilarity concept. Simple Euclidean distances are often used to measure dissimilarity between different examples. However, such distances do not necessarily yield the best performance. In this paper, we propose to learn the best distance for the k-nearest neighbor (k-NN) classifier. The distance concept will be tuned both for the application domain and the adopted representation for the music symbols. The performance of the method is compared with the support vector machine (SVM) classifier using both real and synthetic music scores. The synthetic database includes four types of deformations inducing variability in the printed musical symbols which exist in handwritten music sheets. The work presented here can open new research paths towards a novel automatic musical symbols recognition module for handwritten scores.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122028825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Infinite Dirichlet Mixture Model and Its Application via Variational Bayes","authors":"Wentao Fan, N. Bouguila","doi":"10.1109/ICMLA.2011.81","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.81","url":null,"abstract":"In this paper, we propose a Bayesian nonparametric approach for modeling and selection based on the mixture of Dirichlet processes with Dirichlet distributions, which can also be considered as an infinite Dirichlet mixture model. The proposed model adopts a stick-breaking representation of the Dirichlet process and is learned through a variational inference method. In our approach, the determination of the number of clusters is sidestepped by assuming an infinite number of clusters. The effectiveness of our approach is tested on a real application involving unsupervised image categorization.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128371018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Method to Generate Virtual Samples for Solving Small Sample Set Problems","authors":"A. Dehghani, Jun Zheng","doi":"10.1109/ICMLA.2011.18","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.18","url":null,"abstract":"As confirmed by theory and experiments, a key factor in successfully solving a supervised learning task, especially in the case that the hypothesis is highly complex, is the number of samples available to the learner. On the other hand, in real world applications, it may not be possible to provide a sufficient number of training samples to the learner because of high acquisition cost or the inability to obtain samples. In this paper, we propose a method addressing the problem of learning with a small sample set by generating additional virtual samples. In the absence of any useful prior knowledge about the functional form of the target model, we take a closer look at the distribution patterns of available samples in low dimensional subspaces and constitute the rules that each sample, including virtual samples, must obey. These rules along with other problem constraints are used as weak conditions to refine the virtual samples through an optimization process. The method is applied to two real-world learning problems. The experimental results support the efficiency of the method for solving small sample set problems.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130846044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regional Normal Liver Tissue Density Changes in Patients Treated with Stereotactic Body Radiation Therapy for Liver Metastases","authors":"C. Howells, Q. Diot, D. Westerly, M. Miften","doi":"10.1109/ICMLA.2011.121","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.121","url":null,"abstract":"A quantitative approach to evaluate stereotactic body radiation therapy (SBRT)-induced normal liver tissue changes in patients with liver metastases was performed. 104 non-contrast treatment follow-up computed tomography (CT) scans of 35 patients who received SBRT between 2004 and 2011 were retrospectively analyzed (range, 0.7-36 months, median, 8.1 months). The dose distributions from planning CTs were mapped to follow-up CTs using rigid registration. SBRT-induced normal liver density changes on post-SBRT follow-up CT scans were evaluated at approximately 4, 8, 12, 18, and 36 months. Dose-response curves (DRCs) were generated over the entire patient population by computing the mean Hounsfield unit (HU) in liver regions corresponding to dose bins ranging from 0-55 Gy in 5 Gy intervals. A hypodense radiologic change in irradiated liver linearly related to dose (slope, -0.13 ΔHU/Gy) was observed, with significant mean CT changes of -9.3 ± 0.64 ΔHU and -9.8 ± 0.75 ΔHU at 45-50 Gy and 50-55 Gy, respectively. Furthermore, the data revealed that SBRT induces this hypodense radiation reaction with demarcation set by the 30 to 35 Gy isodose volume.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130855523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Transfer Learning via Restricted Boltzmann Machine for Document Classification","authors":"Jian Zhang","doi":"10.1109/ICMLA.2011.51","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.51","url":null,"abstract":"Transfer learning aims to improve a targeted learning task using other related auxiliary learning tasks and data. Most current transfer-learning methods focus on scenarios where the auxiliary and the target learning tasks are very similar: either (some of) the auxiliary data can be directly used as training examples for the target task or the auxiliary and the target data share the same representation. However, in many cases the connection between the auxiliary and the target tasks can be remote. Only a few features derived from the auxiliary data may be helpful for the target learning. We call such a scenario the deep transfer-learning scenario and we introduce a novel transfer-learning method for deep transfer. Our method uses a restricted Boltzmann machine to discover a set of hierarchical features from the auxiliary data. We then select from these features a subset that is helpful for the target learning, using a selection criterion based on the concept of kernel-target alignment. Finally, the target data are augmented with the selected features before training. Our experimental results show that this transfer method is effective. It can improve classification accuracy by up to more than 10%, even when the connection between the auxiliary and the target tasks is not apparent.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131694525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}