Iago Correa, Paulo L. J. Drews-Jr, S. Botelho, M. S. Souza, V. Tavano
{"title":"Deep Learning for Microalgae Classification","authors":"Iago Correa, Paulo L. J. Drews-Jr, S. Botelho, M. S. Souza, V. Tavano","doi":"10.1109/ICMLA.2017.0-183","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-183","url":null,"abstract":"Microalgae are unicellular organisms that present limited physical characteristics such as size, shape or even the present structures. Classifying them manually may require great effort from experts since thousands of microalgae can be found in a small sample of water. Furthermore, the manual classification is a non-trivial operation. We proposed a deep learning technique to solve the problem. We also created a classified dataset that allows us to adopt this technique. To the best of our knowledge, the present work is the first one to apply this kind of technique on the microalgae classification task. The obtained results show the capabilities of the method to properly classify the data by using as input the low resolution images acquired by a particle analyzer instead of pre-processed features. We also show the improvement provided by the use of data augmentation technique.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"2 1","pages":"20-25"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90117770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michelangelo Diligenti, Soumali Roychowdhury, M. Gori
{"title":"Integrating Prior Knowledge into Deep Learning","authors":"Michelangelo Diligenti, Soumali Roychowdhury, M. Gori","doi":"10.1109/ICMLA.2017.00-37","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-37","url":null,"abstract":"Deep learning allows to develop feature representations and train classification models in a fully integrated way. However, learning deep networks is quite hard and it improves over shallow architectures only if a large number of training data is available. Injecting prior knowledge into the learner is a principled way to reduce the amount of required training data, as the learner does not need to induce the knowledge from the data itself. In this paper we propose a general and principled way to integrate prior knowledge when training deep networks. Semantic Based Regularization (SBR) is used as underlying framework to represent the prior knowledge, expressed as a collection of first-order logic clauses (FOL), and where each task to be learned corresponds to a predicate in the knowledge base. The knowledge base correlates the tasks to be learned and it is translated into a set of constraints which are integrated into the learning process via backpropagation. The experimental results show how the integration of the prior knowledge boosts the accuracy of a state-of-the-art deep network on an image classification task.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"44 1","pages":"920-923"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90462121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OP-DCI: A Riskless K-Means Clustering for Influential User Identification in MOOC Forum","authors":"X. Hou, Chi-Un Lei, Yu-Kwong Kwok","doi":"10.1109/ICMLA.2017.00-34","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-34","url":null,"abstract":"Massive Open Online Courses (MOOCs) have recently been highly popular among worldwide learners, while it is challenging to manage and interpret the large-scale discussion forum which is the dominant channel of online communication. K-Means clustering, one of the famous unsupervised learning algorithms, could help instructors identify influential users in MOOC forum, to better understand and improve online learning experience. However, traditional K-Means suffers from bias of outliers and risk of falling into local optimum. In this paper, OP-DCI, an optimized K-Means algorithm is proposed, using outlier post-labeling and distant centroid initialization. Outliers are not solely filtered out but extracted as distinct objects for post-labeling, and distant centroid initialization eliminates the risk of falling into local optimum. With OP-DCI, learners in MOOC forum are clustered efficiently with satisfactory interpretation, and instructors can subsequently design personalized learning strategies for different clusters.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"53 11","pages":"936-939"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91498787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Empirical Study of the Hidden Matrix Rank for Neural Networks with Random Weights","authors":"Pablo A. Henríquez, G. A. Ruz","doi":"10.1109/ICMLA.2017.00-44","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-44","url":null,"abstract":"Neural networks with random weights can be regarded as feed-forward neural networks built with a specific randomized algorithm, i.e., the input weights and biases are randomly assigned and fixed during the training phase, and the output weights are analytically evaluated by the least square method. This paper presents an empirical study of the hidden matrix rank for neural networks with random weights. We study the impacts of the scope of random parameters on the model's performance, and show that the assignment of the input weights in the range [-1,1] is misleading. Experiments were conducted using two types of neural networks obtaining insights not only on the input weights but also how these relate to different architectures.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"99 1","pages":"883-888"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81268753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Suresh Kumar Gudla, Joy Bose, Venugopal Gajam, S. Srinivasa
{"title":"Relevancy Ranking of User Recommendations of Services Based on Browsing Patterns","authors":"Suresh Kumar Gudla, Joy Bose, Venugopal Gajam, S. Srinivasa","doi":"10.1109/ICMLA.2017.00-66","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-66","url":null,"abstract":"There are a number of inbound web services, which recommend content to users. However, there is no way for such services to prioritize their recommendations as per the users' interests. Here we are not interested in generating new recommendations, but rather organizing and prioritizing existing recommendations in order to increase the click rate. Since users have different patterns of browsing that also change frequently, it is good to have a system that prioritizes recommendations based on the current browsing patterns of individual users. In this paper we present such a system. We first generate the clusters of article topics using URLs from the users' browsing history, which is then used to generate the relevancy scores of the recommendation services based on entropy. The relevancy scores are then fed to the service providers, which use them to prioritize their recommendations by ranking them based on the relevancy scores. We test the model using the browsing history for 10 users, and validate the model by calculating the correlation of the generated relevancy scores with the users' manually provided topic preferences. We further use collaborative filtering to benchmark the usefulness of our ranking systems.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"31 1","pages":"765-768"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76689519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammad Firouzi, Mahmood Karimian, Mahdieh Soleymani
{"title":"NMF-Based Label Space Factorization for Multi-label Classification","authors":"Mohammad Firouzi, Mahmood Karimian, Mahdieh Soleymani","doi":"10.1109/ICMLA.2017.0-144","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-144","url":null,"abstract":"Multi-label classification is a learning task in which each data sample can belong to more than one class. Until now, some methods that are based on reducing the dimensionality of the label space have been proposed. However, these methods have not used specific properties of the label space for this purpose. In this paper, we intend to find a hidden space in which both the input feature vectors and the label vectors are embedded. We propose a modified Non-Negative Matrix Factorization (NMF) method that is suitable for decomposing the label matrix and finding a proper hidden space by a feature-aware approach. We consider that the label matrix is binary and also in this matrix some deserving labels for an instance may not be on (called missing labels). We conduct several experiments and show the superiority of our proposed methods to the state-of-the-art multi-label classification methods.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"47 1","pages":"297-303"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78412779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Investigation of How Neural Networks Learn from the Experiences of Peers Through Periodic Weight Averaging","authors":"Joshua Smith, Michael S. Gashler","doi":"10.1109/ICMLA.2017.00-72","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-72","url":null,"abstract":"We investigate a method for cooperative learning called weighted average model fusion that enables neural networks to learn from the experiences of other networks, as well as from their own experiences. Modern machine learning methods have focused predominantly on learning from direct training, but many situations exist where the data cannot be aggregated, rendering direct learning impossible. However, we show that the simple approach of averaging weights with peer neural networks at periodic intervals enables neural networks to learn from second hand experiences. We analyze the effects that several meta-parameters have on model fusion to provide deeper insights into how they affect cooperative learning in a variety of scenarios.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"6 1","pages":"731-736"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88319651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. C. B. Martins, Rommel N. Carvalho, Ricardo Silva Carvalho, M. Victorino, M. Holanda
{"title":"Early Prediction of College Attrition Using Data Mining","authors":"L. C. B. Martins, Rommel N. Carvalho, Ricardo Silva Carvalho, M. Victorino, M. Holanda","doi":"10.1109/ICMLA.2017.000-6","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.000-6","url":null,"abstract":"College attrition is a chronic problem for institutions of higher education. In Brazilian public universities, attrition also accounts for the significant waste of public resources desperately needed in other sectors of society. Thus, given the severity and persistence of this problem, several studies have been conducted in an attempt to mitigate undergraduate dropout rates. Using H2O software as a data mining tool, our study employed parameter tuning to train 321 of three classification algorithms, and with Deep Learning, it was possible to predict 71.1% of the cases of dropout given these characteristics. With this result, it will be possible to identify the attrition profiles of students and implement corrective measures on initiating their studies.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"43 1","pages":"1075-1078"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87369594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anna Palczewska, Jan Palczewski, G. Aivaliotis, Lukasz Kowalik
{"title":"RobustSPAM for Inference from Noisy Longitudinal Data and Preservation of Privacy","authors":"Anna Palczewska, Jan Palczewski, G. Aivaliotis, Lukasz Kowalik","doi":"10.1109/ICMLA.2017.0-137","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-137","url":null,"abstract":"The availability of complex temporal datasets in social, health and consumer contexts has driven the development of pattern mining techniques that enable the use of classical machine learning tools for model building. In this work we introduce a robust temporal pattern mining framework for finding predictive patterns in complex timestamped multivariate and noisy data. We design an algorithm RobustSPAM that enables mining of temporal patterns from data with noisy timestamps. We apply our algorithm to social care data from a local government body and investigate how the efficiency and accuracy of the method depends on the level of noise. We further explore the trade-off between the loss of predictivity due to perturbation of timestamps and the risk of person re-identification.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"76 1","pages":"344-351"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80541163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Belgrave, R. Cassidy, D. Stamate, A. Custovic, L. Fleming, A. Bush, S. Saglani
{"title":"Predictive Modelling Strategies to Understand Heterogeneous Manifestations of Asthma in Early Life","authors":"D. Belgrave, R. Cassidy, D. Stamate, A. Custovic, L. Fleming, A. Bush, S. Saglani","doi":"10.1109/ICMLA.2017.0-176","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-176","url":null,"abstract":"Wheezing is common among children and ∼50% of those under 6 years of age are thought to experience at least one episode of wheeze. However, due to the heterogeneity of symptoms there are difficulties in treating and diagnosing these children. ‘Phenotype specific therapy’ is one possible avenue of treatment, whereby we use significant pathology and physiology to identify and treat pre-schoolers with wheeze. By performing feature selection algorithms and predictive modelling techniques, this study will attempt to determine if it is possible to robustly distinguish patient diagnostic categories among pre-school children. Univariate feature analysis identified more objective variables and recursive feature elimination a larger number of subjective variables as important in distinguishing between patient categories. Predictive modelling saw a drop in performance when subjective variables were removed from analysis, indicating that these variables are important in distinguishing wheeze classes. We achieved 90%+ performance in AUC, sensitivity, specificity, and accuracy, and 80%+ in kappa statistic, in distinguishing ill from healthy patients. Developed in a synergistic statistical - machine learning approach, our methodologies also propose a novel ROC Cross Evaluation method for model post-processing and evaluation. Our predictive modelling's stability was assessed in computationally intensive Monte Carlo simulations.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"1 1","pages":"68-75"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76564511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}