{"title":"Complex Decomposition of the Negative Distance Kernel","authors":"Tim vor der Brück, Steffen Eger, Alexander Mehler","doi":"10.1109/ICMLA.2015.151","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.151","url":null,"abstract":"A Support Vector Machine (SVM) has become a very popular machine learning method for text classification. One reason for this relates to the range of existing kernels which allow for classifying data that is not linearly separable. The linear, polynomial and RBF (Gaussian Radial Basis Function) kernel are commonly used and serve as a basis of comparison in our study. We show how to derive the primal form of the quadratic Power Kernel (PK) -- also called the Negative Euclidean Distance Kernel (NDK) -- by means of complex numbers. We exemplify the NDK in the framework of text categorization using the Dewey Document Classification (DDC) as the target scheme. Our evaluation shows that the power kernel produces F-scores that are comparable to the reference kernels, but is -- except for the linear kernel -- faster to compute. Finally, we show how to extend the NDK-approach by including the Mahalanobis distance.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130473389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EEG-based Secondary Task Detection in a Multiple Objective Operational Environment","authors":"Joseph J. Giametta, B. Borghetti","doi":"10.1109/ICMLA.2015.107","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.107","url":null,"abstract":"Real world operational environments often require the integration of complex multiple-objective tasks that necessitate split attention and individual prioritization in human operators. This study examines the effect of secondary task presence on operator electroencephalogram (EEG) activity in two different multiple-objective remotely piloted aircraft (RPA) simulations. Eight participants completed simulated aerial reconnaissance tasks of varying difficulties, while continuously monitoring and responding to radio traffic requesting distance, speed, and elevation calculations that required expedient mathematical reasoning. In these realistic dynamic task scenarios, balanced random forest and binary logistic regression classifiers are used to measure the effectiveness of 35 physiological markers in detecting operator workload changes. Results suggest that within-subject random forest models perform reasonably well even when trained using alternative primary tasks. Additionally, novel evidence supporting the importance of delta band (1-3Hz) brain activity for task detection is reported.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127837743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Vector Quantization of Hough Transform for Circle Detection","authors":"Bing Zhou","doi":"10.1109/ICMLA.2015.94","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.94","url":null,"abstract":"Circles are important patterns in many automatic image inspection applications. The Hough Transform (HT) is a popular method for extracting shapes from original images. It was first introduced for the recognition of straight lines, and later extended to circles. The drawbacks of standard Hough Transform for circle detection are the large computational and storage requirements. In this paper, we propose a modified HT called Vector Quantization of Hough Transform (VQHT) to detect circles more efficiently. The basic idea is to first decompose the edge image into many sub-images by using Vector Quantization algorithm based on their natural spatial relationship. The edge points resided in each sub-image are considered as one circle candidate group. Then the VQHT algorithm is applied for fast circle detection. Experimental results show that the proposed algorithm can quickly and accurately detect multiple circles from the noisy background.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131348857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Neural Networks: A Case Study for Music Genre Classification","authors":"A. Rajanna, Kamelia Aryafar, A. Shokoufandeh, R. Ptucha","doi":"10.1109/ICMLA.2015.160","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.160","url":null,"abstract":"Music classification is a challenging problem with many applications in today's large-scale datasets with Gigabytes of music files and associated metadata and online streaming services. Recent success with deep neural network architectures on large-scale datasets has inspired numerous studies in the machine learning community for various pattern recognition and classification tasks such as automatic speech recognition, natural language processing, audio classification and computer vision. In this paper, we explore a two-layer neural network with manifold learning techniques for music genre classification. We compare the classification accuracy rate of deep neural networks with a set of well-known learning models including support vector machines (SVM and ℓ1-SVM), logistic regression and ℓ1-regression in combination with hand-crafted audio features for a genre classification task on a public dataset. Our experimental results show that neural networks are comparable with classic learning models when the data is represented in a rich feature space.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125450625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Scenarios for Demand Forecast of a High Voltage Feeder: A Comparative Study","authors":"R. Bayindir, M. Yesilbudak, U. Çetinkaya, H. Bulbul, F. Arslan","doi":"10.1109/ICMLA.2015.34","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.34","url":null,"abstract":"The electricity demand forecasting has gained remarkable concern in energy market operation and planning with the emergence of deregulation in the power industry. Power system operators benefit from accurate demand forecasts by supporting investment decisions more objectively. As a crucial requirement, this paper focuses on hourly demand forecasts of a high voltage feeder. Moving average (MA), weighted moving average (WMA), autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models have been used for creating statistical demand scenarios at 1-h, 2-h, 3-h and 4-h intervals. Many constructive comparisons have been conducted among MA, WMA, ARMA and ARIMA models comprehensively. Besides, the best statistical model employed in each hourly demand scenario provides the robust improvement percentage with respect to the persistence model.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123263857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Does the Inclusion of Data Sampling Improve the Performance of Boosting Algorithms on Imbalanced Bioinformatics Data?","authors":"Alireza Fazelpour, T. Khoshgoftaar, D. Dittman, Amri Napolitano","doi":"10.1109/ICMLA.2015.23","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.23","url":null,"abstract":"Bioinformatics datasets contain many challenging characteristics, such as class imbalance, which adversely impacts the performance of supervised classification models built on these datasets. Techniques such as ensemble learning and data sampling from the domain of data mining can be deployed to alleviate the problem and to improve the classification performance. In this study, we sought to seek whether inclusion of data sampling within the ensemble framework can further improve the performance of classification models. To this end, we performed an experimental study using two newly hybrid ensemble techniques, one integrates feature selection within the boosting process and the other incorporates random under-sampling followed by feature selection within the boosting framework, two learners, three forms of feature rankers, and four feature subset sizes on 15 highly imbalanced bioinformatics datasets. Our results and statistical analysis demonstrate that the difference between the two boosting methods is statistically insignificant. Therefore, as the inclusion of data sampling has no significant positive effect on the performance of ensemble classifiers, it is not required to achieve maximum classification performance. To our knowledge, this is the first empirical study that examined the effects of data sampling, random under-sampling, to enhance classification performance of boosting algorithm for highly imbalanced bioinformatics data.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126384076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Energy Demand Peak Using M5 Model Trees","authors":"Sara S. Abdelkader, Katarina Grolinger, Miriam A. M. Capretz","doi":"10.1109/ICMLA.2015.164","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.164","url":null,"abstract":"Predicting energy demand peak is a key factor for reducing energy demand and electricity bills for commercial customers. Features influencing energy demand are many and complex, such as occupant behaviours and temperature. Feature selection can decrease prediction model complexity without sacrificing performance. In this paper, features were selected based on their multiple linear regression correlation coefficients. This paper discusses the capabilities of M5 model trees in energy demand prediction for commercial buildings. M5 model trees are similar to regression trees, however they are more suitable for continuous prediction problems. The M5 model tree prediction was developed based on a selected feature set including sensor energy demand readings, day of the week, season, humidity, and weather conditions (sunny, rain, etc.). The performance of the M5 model tree was evaluated by comparing it to the support vector regression (SVR) and artificial neural networks (ANN) models. The M5 model tree outperformed the SVR and ANN models with a mean absolute error (MAE) of 8.94 compared to 10.02 and 12.04 for the SVR and ANN models respectively.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116056622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating Eating Behaviours Using Topic Models","authors":"Ruth White, W. Harwin, W. Holderbaum, Laura Johnson","doi":"10.1109/ICMLA.2015.50","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.50","url":null,"abstract":"Chronic conditions, such as diabetes and obesity are related to quality of diet. However, current research findings are conflicting with regards to the impact of snacking on diet quality. One reason for this is the lack of a clear definition of a snack or a meal. This paper presents a novel approach to understanding how foods are grouped together in eating events using a machine learning algorithm, topic models. Approaches for applying topic models to a nutrition application are discussed. A topic model is implemented for the UK National Diet and Nutrition Survey Rolling Programme dataset. The results demonstrate that the topics found are representative of typical eating events in terms of food group content and associated time of day. There is a strong potential for topic models to reveal useful patterns in food diary data that have not previously been considered.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116662402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concept of 4th Dimension for Databases","authors":"E. Irmak, Ömer Kurtuldu","doi":"10.1109/ICMLA.2015.186","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.186","url":null,"abstract":"In these days, the data are being more and more important for not only social or commercial aspects but also military and security aspects. Therefore, storing the data accurately and accessing it exactly are quite important issues. Currently, most databases use 3 dimension (3D) data structure to store the physical parameters of real objects, which are width, length and depth/height. If the data have the four dimension for any object, it will definitely be more useful than 3D structure. In this paper, we investigated to how the time can be used as the 4th dimension for any object and the concepts of dynamic calculation of the time in order to store it in databases. Some type of objects have been selected as base shapes such as rectangular, cylinder, sphere, ellipse, pyramid and cone, for 4th dimension objects and a sample application is given in the study in order to explain how the time dimension can be used for databases.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121703289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian Classification Approach to Improving Performance for a Real-World Sales Forecasting Application","authors":"C. Gallagher, M. G. Madden, Brian D'Arcy","doi":"10.1109/ICMLA.2015.150","DOIUrl":"https://doi.org/10.1109/ICMLA.2015.150","url":null,"abstract":"Many businesses rely on forecasting techniques to detect whether sales opportunities are likely to be won or at risk of being lost. This enables the businesses to respond proactively. This paper describes a new method of sales forecasting that improves on an existing Qualitative Sales Predictor (QSP) in Hewlett-Packard (HP). QSP is based on a series of qualitative assessments that are made by sales personnel, the results of which are combined using weighted factors. In this research, we have developed an alternative method of forecasting sales opportunities, with three key differences: (1) the qualitative assessments are supplemented with quantitative data describing attributes of the opportunity, (2) we replace the weight factors with a Tree Augmented Naïve Bayes (TAN) classifier that can capture dependences between variables and produces a probabilistic output to which thresholds can be applied, (3) the TAN classifier is of course learned from historical data, whereas the existing QSP has fixed weights. Our approach has an accuracy of 90.6% in predicting whether sales will be won or lost, a substantial improvement on the existing approach's accuracy of 75.6% on the same unseen test data.","PeriodicalId":288427,"journal":{"name":"2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123815976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}