{"title":"Tree-Structured Curriculum Learning Based on Semantic Similarity of Text","authors":"Sanggyu Han, Sung-Hyon Myaeng","doi":"10.1109/ICMLA.2017.00-27","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-27","url":null,"abstract":"Inspired by the notion of a curriculum that allows human learners to acquire knowledge from easy to difficult materials, curriculum learning (CL) has been applied to many areas including Natural Language Processing (NLP). Most previous CL methods in NLP learn texts according to their lengths. We posit, however, that learning semantically similar texts is more effective than simply relying on superficial easiness such as text lengths. As such, we propose a new CL method that considers semantic dissimilarity as the complexity measure and a tree-structured curriculum as the organization method. The proposed CL method shows better performance than previous CL methods on a sentiment analysis task in an experiment.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"37 1","pages":"971-976"},"PeriodicalIF":0.0,"publicationDate":"2017-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90103826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Psychosis Using the Experience Sampling Method with Mobile Apps","authors":"D. Stamate, Andrea Katrinecz, W. Alghamdi, D. Ståhl, P. Delespaul, J. Os, S. Guloksuz","doi":"10.1109/ICMLA.2017.00-84","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-84","url":null,"abstract":"Smart phones have become ubiquitous in recent years, which has opened up a new opportunity for rediscovering the Experience Sampling Method (ESM) in a new, efficient form using mobile apps, with great prospects of becoming a low-cost and high-impact mHealth tool for psychiatry practice. The method is used to collect longitudinal data of participants' daily life experiences, and is ideal to capture fluctuations in emotions (momentary mental states) as an early indicator for later mental health disorder. In this study ESM data of patients with psychosis and controls were used to examine emotion changes and identify patterns. This paper attempts to determine whether aggregated ESM data, in which statistical measures represent the distribution and dynamics of the original data, are able to distinguish patients from controls. Variable importance, recursive feature elimination and ReliefF methods were used for feature selection. Model training, tuning and testing were performed in nested cross-validation, and were based on algorithms such as Random Forests, Support Vector Machines, Gaussian Processes, Logistic Regression and Neural Networks. ROC analysis was used to post-process these models. Stability of model performances was studied using Monte Carlo simulations. The results provide evidence that patterns in mood changes can be captured with the combination of techniques used. The best results were achieved by SVM with radial kernel, where the best model performed with 82% accuracy and 82% sensitivity.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"26 1","pages":"667-673"},"PeriodicalIF":0.0,"publicationDate":"2017-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81882553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direct Multiclass Boosting Using Base Classifiers' Posterior Probabilities Estimates","authors":"M. Bourel, B. Ghattas","doi":"10.1109/ICMLA.2017.0-154","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-154","url":null,"abstract":"We present a new multiclass boosting algorithm called Adaboost.BG. Like Freund and Schapire's original Adaboost algorithm, it aggregates trees, but instead of using their misclassification error it takes into account the margins of the observations, which may be seen as confidence measures of their prediction, rather than their correctness. We prove the efficiency of our algorithm by simulation and compare it to similar approaches known to minimize the global margins of the final classifier.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"15 1","pages":"228-233"},"PeriodicalIF":0.0,"publicationDate":"2017-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73301143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human Action Recognition from Body-Part Directional Velocity Using Hidden Markov Models","authors":"Sid Ahmed Walid Talha, A. Fleury, S. Ambellouis","doi":"10.1109/ICMLA.2017.00-14","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-14","url":null,"abstract":"This paper introduces a novel approach for early recognition of human actions using 3D skeleton joints extracted from 3D depth data. We propose a novel, frame-by-frame and real-time descriptor called Body-part Directional Velocity (BDV) calculated by considering the algebraic velocity produced by different body-parts. A real-time Hidden Markov Models algorithm with Gaussian Mixture Models state-output distributions is used to carry out the classification. We show that our method outperforms various state-of-the-art skeleton-based human action recognition approaches on MSRAction3D and Florence3D datasets. We also proved the suitability of our approach for early human action recognition by deducing the decision from a partial analysis of the sequence.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"8 1","pages":"1035-1040"},"PeriodicalIF":0.0,"publicationDate":"2017-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84132570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Realistic Traffic Generation for Web Robots","authors":"Kyle Brown, Derek Doran","doi":"10.1109/ICMLA.2017.0-161","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-161","url":null,"abstract":"Critical to evaluating the capacity, scalability, and availability of web systems are realistic web traffic generators. Although web traffic generation is a classic research problem, no generator accounts for the characteristics of web robots or crawlers that are now the dominant source of traffic to a web server. Administrators are thus unable to test, stress, and evaluate how their systems perform in the face of ever increasing levels of web robot traffic. To resolve this problem, this paper introduces a novel approach to generate synthetic web robot traffic with high fidelity. It generates traffic that accounts for both the temporal and behavioral qualities of robot traffic by statistical and Bayesian models that are fitted to the properties of robot traffic seen in web logs from North America and Europe. We evaluate our traffic generator by comparing the characteristics of generated traffic to those of the original data. We look at session arrival rates, inter-arrival times and session lengths, comparing and contrasting them between generated and real traffic. Finally, we show that our generated traffic affects cache performance similarly to actual traffic, using the common LRU and LFU eviction policies.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"341 1","pages":"178-185"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73754074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hierarchical, Bulk-Synchronous Stochastic Gradient Descent Algorithm for Deep-Learning Applications on GPU Clusters","authors":"Guojing Cong, Onkar Bhardwaj","doi":"10.1109/ICMLA.2017.00-56","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-56","url":null,"abstract":"The training data and models are becoming increasingly large in many deep-learning applications. Large-scale distributed processing is employed to accelerate training. Increasing the number of learners in synchronous and asynchronous stochastic gradient descent presents challenges to convergence and communication performance. We present our hierarchical, bulk-synchronous stochastic gradient algorithm that effectively balances execution time and accuracy for training in deep-learning applications on GPU clusters. It achieves much better convergence and execution time at scale in comparison to asynchronous stochastic gradient descent implementations. When deployed on a cluster of 128 GPUs, our implementation achieves up to 56 times speedups over the sequential stochastic gradient descent with similar test accuracy for our target application.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"30 1","pages":"818-821"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73810257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced ECHMM-Based Machine Learning Tools for Complex Big Data Applications","authors":"A. Cuzzocrea, E. Mumolo, G. Vercelli","doi":"10.1109/ICMLA.2017.00-86","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00-86","url":null,"abstract":"We present a novel approach for accurate characterization of workloads, which is relevant in the context of complex big data applications. Workloads are generally described with statistical models and are based on the analysis of resource request measurements of a running program. In this paper we propose to consider the sequence of virtual memory references generated from a program during its execution as a temporal series, and to use spectral analysis principles to process the sequence. However, the sequence is time-varying, so we employed processing approaches based on Ergodic Continuous Hidden Markov Models (ECHMMs), which extend conventional stationary spectral analysis approaches to the analysis of time-varying sequences. In this work, we describe two applications of the proposed approach: the on-line classification of a running process and the generation of synthetic traces of a given workload. The first step was to show that ECHMMs accurately describe virtual memory sequences; to this end, a different ECHMM was trained for each sequence and the related run-time average process classification accuracy, evaluated using trace-driven simulations over a wide range of traces of SPEC2000, was about 82%. Then, a single ECHMM was trained using all the sequences obtained from a given running application; again, the classification accuracy was evaluated using the same traces and was about 76%. As regards the synthetic trace generation, a single ECHMM characterizing a given application has been used as a stochastic generator to produce benchmarks for spanning a large application space.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"38 1","pages":"655-660"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75660425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time-Sensitive Behavior Prediction in a Health Social Network","authors":"Amnay Amimeur, Nhathai Phan, D. Dou, David Kil, B. Piniewski","doi":"10.1109/ICMLA.2017.000-4","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.000-4","url":null,"abstract":"Human behavior prediction is critical in understanding and addressing large scale health and social issues in online communities. Specifically, predicting when in the future a user will engage in a behavior as opposed to whether a user will behave at a particular time is a less studied subproblem of behavior prediction. Further lacking is exploration of how social context affects personal behavior and the exploitation of network structure information in behavior and time prediction. To address these problems we propose a novel semi-supervised deep learning model for prediction of return time to personal behavior. A carefully designed objective function ensures the model learns good social context embeddings and historical behavior embeddings in order to capture the effects of social influence on personal behavior. Our model is validated on a unique health social network dataset by predicting when users will engage in physical activities. We show our model outperforms relevant time prediction baselines.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"31 1","pages":"1083-1088"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75751864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anomaly Prediction Based on k-Means Clustering for Memory-Constrained Embedded Devices","authors":"Yuto Kitagawa, Tasuku Ishigoka, Takuya Azumi","doi":"10.1109/ICMLA.2017.0-182","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.0-182","url":null,"abstract":"This paper proposes an anomaly prediction method based on k-means clustering that targets embedded devices with memory constraints and predicts control system anomalies. With this method, it is possible to predict anomalies by checking control system behavior. However, continuing clustering is difficult because, as in existing k-means clustering methods, data accumulate in memory, which is problematic for embedded devices with low memory capacity. Therefore, we also propose a k-means clustering method that can continue clustering over infinite stream data. The proposed k-means clustering method is based on online k-means clustering with sequential processing. It stores only the data required for anomaly prediction and releases other data from memory. Experimental results show that anomalies can be predicted by k-means clustering, and that the proposed method predicts anomalies similarly to standard k-means clustering while reducing memory consumption. Moreover, the proposed k-means clustering achieves better anomaly prediction results than existing online k-means clustering.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"22 1","pages":"26-33"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74774948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the Impact of Clone Refactoring on Test Code Size in Object-Oriented Software","authors":"M. Badri, L. Badri, Oussama Hachemane, Alexandre Ouellet","doi":"10.1109/ICMLA.2017.00098","DOIUrl":"https://doi.org/10.1109/ICMLA.2017.00098","url":null,"abstract":"This paper aims at exploring the impact of clone refactoring on the test code size, in terms of number of operations, in object-oriented software. We investigated three research questions: (1) the impact of clone refactoring on three important source code attributes (coupling, complexity and size) that are related to unit testability of classes, (2) the impact of clone refactoring on the test code size, and (3) the variations after clone refactoring in the source code attributes that have the most important impact on the test code size. We used linear regression and three popular machine learning techniques (i.e., k-Nearest Neighbors, Naïve Bayes and Random Forest) to develop predictive and explanatory models. We used data collected from an open source Java software system (ANT) that has been refactored using clone-refactoring techniques. The analyses indicate that there is a strong and positive relationship between clone refactoring and the reduction of the test code size. Results show that: (1) the source code attributes of refactored classes have been significantly improved, (2) the test code size of refactored classes has been significantly reduced, and (3) the variations of the test code size are more influenced by the variations of the complexity and size of refactored classes compared to coupling.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"63 1","pages":"586-592"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75030536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}