{"title":"Alarm prediction in industrial machines using autoregressive LS-SVM models","authors":"R. Langone, C. Alzate, A. Bey-Temsamani, J. Suykens","doi":"10.1109/CIDM.2014.7008690","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008690","url":null,"abstract":"In industrial machines, different alarms are embedded in the machine controllers. They use sensors and machine states to give end-users various information (e.g. diagnostics or need for maintenance) or to put machines in a specific mode (e.g. shut-down when thermal protection is activated). More specifically, the alarms are often triggered by comparing sensor data to a threshold defined in the controller software. In batch production machines, triggering an alarm (e.g. thermal protection) in the middle of a batch is critical for the quality of the produced batch and results in a high production loss. This situation can be avoided if the settings of the production machine (e.g. production speed) are adjusted based on temperature monitoring. Therefore, predicting a temperature alarm and adjusting the production speed to avoid triggering it is a logical approach. In this paper we show the effectiveness of Least Squares Support Vector Machines (LS-SVMs) in predicting the evolution of the temperature in a steel production machine and, as a consequence, possible alarms due to overheating. First, in an offline fashion, we develop a nonlinear autoregressive (NAR) model, where a systematic model selection procedure allows the model parameters to be tuned carefully. Afterwards, the NAR model is used online to forecast the future temperature trend. Finally, a classifier that takes the outputs of the NAR model as input makes it possible to foresee future alarms.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"520 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123066461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating topic quality using model clustering","authors":"V. Mehta, R. Caceres, K. Carter","doi":"10.1109/CIDM.2014.7008665","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008665","url":null,"abstract":"Topic modeling continues to grow as a popular technique for finding hidden patterns, as well as grouping collections of new types of text and non-text data. Recent years have witnessed a growing body of work in developing metrics and techniques for evaluating the quality of topic models and the topics they generate. This is particularly true for text data where significant attention has been given to the semantic interpretability of topics using measures such as coherence. It has been shown however that topic assessments based on coherence metrics do not always align well with human judgment. Other efforts have examined the utility of information-theoretic distance metrics for evaluating topic quality in connection with semantic interpretability. Although there has been progress in evaluating interpretability of topics, the existing intrinsic evaluation metrics do not address some of the other aspects of concern in topic modeling such as: the number of topics to select, the ability to align topics from different models, and assessing the quality of training data. Here we propose an alternative metric for characterizing topic quality that addresses all three aforementioned issues. Our approach is based on clustering topics, and using the silhouette measure, a popular clustering index, for characterizing the quality of topics. We illustrate the utility of this approach in addressing the other topic modeling concerns noted above. Since this metric is not focused on interpretability, we believe it can be applied more broadly to text as well as non-text data. In this paper however we focus on the application of this metric to archival and non-archival text data.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"485 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123057660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weighted one-class classification for different types of minority class examples in imbalanced data","authors":"B. Krawczyk, Michal Wozniak, F. Herrera","doi":"10.1109/CIDM.2014.7008687","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008687","url":null,"abstract":"Imbalanced classification is one of the most challenging machine learning problems. Recent studies show that the uneven ratio of objects in classes is often not the biggest factor determining the drop in classification accuracy. The drop is also related to difficulties embedded in the nature of the data. In this paper we study the different types of minority class examples and distinguish four groups of objects: safe, borderline, rare and outliers. To deal with the imbalance problem, we use one-class classification, which focuses on proper identification of the minority class samples. We further augment this model by incorporating knowledge about the minority object types in the training dataset. This is done by applying a weighted one-class classifier and adjusting the weights assigned to minority class objects, depending on their type. A strategy for calculating the new weights for minority examples is proposed. Experimental analysis, carried out on a set of benchmark datasets, confirms that the proposed model can achieve a satisfactory recognition rate and often outperforms other state-of-the-art methods dedicated to imbalanced classification.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121546918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Agglomerative hierarchical kernel spectral data clustering","authors":"Raghvendra Mall, R. Langone, J. Suykens","doi":"10.1109/CIDM.2014.7008142","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008142","url":null,"abstract":"In this paper we extend the agglomerative hierarchical kernel spectral clustering (AH-KSC [1]) technique from networks to datasets and images. The kernel spectral clustering (KSC) technique builds a clustering model in a primal-dual optimization framework. The dual solution leads to an eigen-decomposition. The clustering model consists of kernel evaluations, projections onto the eigenvectors and a powerful out-of-sample extension property. We first estimate the optimal model parameters using the balanced angular fitting (BAF) [2] criterion. We then exploit the eigen-projections corresponding to these parameters to automatically identify a set of increasing distance thresholds. These distance thresholds provide the clusters at different levels of hierarchy in the dataset which are merged in an agglomerative fashion as shown in [1], [4]. We showcase the effectiveness of the AH-KSC method on several datasets and real world images. We compare the AH-KSC method with several agglomerative hierarchical clustering techniques and overcome the issues of hierarchical KSC technique proposed in [5].","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133953833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting student success based on prior performance","authors":"Ahmad Slim, G. Heileman, Jarred Kozlick, C. Abdallah","doi":"10.1109/CIDM.2014.7008697","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008697","url":null,"abstract":"Colleges and universities are increasingly interested in tracking student progress as they monitor and work to improve their retention and graduation rates. Ideally, early indicators of student progress, or lack thereof, can be used to provide appropriate interventions that increase the likelihood of student success. In this paper we present a framework that uses machine learning, and in particular, a Bayesian Belief Network (BBN), to predict the performance of students early in their academic careers. The results obtained show that the proposed framework can predict student progress, specifically student grade point average (GPA) within the intended major, with minimal error after observing a single semester of performance. Furthermore, as additional performance is observed, the predicted GPA in subsequent semesters becomes increasingly accurate, providing the ability to advise students regarding likely success outcomes early in their academic careers.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131966107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FATHOM: A neural network-based non-verbal human comprehension detection system for learning environments","authors":"Fiona J. Buckingham, Keeley A. Crockett, Z. Bandar, J. O'Shea","doi":"10.1109/CIDM.2014.7008696","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008696","url":null,"abstract":"This paper presents the application of FATHOM, a computerised non-verbal comprehension detection system, to distinguish participant comprehension levels in an interactive tutorial. FATHOM detects high and low levels of human comprehension by concurrently tracking multiple non-verbal behaviours using artificial neural networks. Presently, human comprehension is predominantly monitored from written and spoken language. Therefore, a large niche exists for exploring human comprehension detection from a non-verbal behavioural perspective using artificially intelligent computational models such as neural networks. In this paper, FATHOM was applied to a video-recorded exploratory study containing a learning task designed to elicit high and low comprehension states from the learner. The learning task consisted of watching a video on termites, suitable for the general public, and an interview-led question and answer session. This paper describes how FATHOM's comprehension classifier artificial neural network was trained and validated in comprehension detection using the standard backpropagation algorithm. The results show that high and low comprehension states can be detected from learners' non-verbal behavioural cues with testing classification accuracies above 76%.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"257 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132368629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What can spatial collectives tell us about their environment?","authors":"Zena Wood","doi":"10.1109/CIDM.2014.7008686","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008686","url":null,"abstract":"Understanding how large groups of individuals move within their environment, and the social interactions that occur during this movement, is central to many fundamental interdisciplinary research questions; ranging from understanding the evolution of cooperation, to managing human crowd behaviour. If we could understand how groups of individuals interact with their environment, and any role that the environment plays in their behaviour, we could design and develop space to better suit their needs. Spatiotemporal datasets that record the movement of large groups of individuals are becoming increasingly available. A method, based on a set of coherence criteria, has previously been developed to identify different types of collective within such datasets. However, further investigations have revealed that the method can be used to reveal important information about the environment. This paper applies the method to a spatiotemporal dataset that records the movements of ships within the Solent, in the UK, over a twenty-four hour period to explore what can be inferred from the movement of groups of individuals, referred to as spatial collectives, regarding the environment.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"259 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132881906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Product aspect identification: Analyzing role of different classifiers","authors":"Xing Hu, S. Manna, Brian N. Truong","doi":"10.1109/CIDM.2014.7008668","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008668","url":null,"abstract":"With the rapid advancement of eCommerce, it has become common for customers to write reviews about the products they purchase. For certain popular products, such as cell phones, laptops and tablets, the number of reviews can be in the hundreds or even thousands, making it difficult for potential customers to get an aspect-based overview of the product (for example, screen, camera, battery, etc.). This paper studies different classifiers for aspect identification from unlabeled free-form textual customer reviews. Firstly, a multi-aspect classification is proposed to learn implicit and explicit aspect-related context from the reviews for aspect identification, which does not require any manually labeled training data. Secondly, extensive experiments analyzing the effectiveness of classifiers and feature selection for aspect identification are presented. The results of our experiments on smartphone reviews from Amazon show that the Support Vector Machine achieves the best accuracy in aspect identification, followed by Random Forest and Naive Bayes.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134604547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The use of process mining in a business process simulation context: Overview and challenges","authors":"Niels Martin, B. Depaire, A. Caris","doi":"10.1109/CIDM.2014.7008693","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008693","url":null,"abstract":"This paper focuses on the potential of process mining to support the construction of business process simulation (BPS) models. To date, research efforts are scarce and have a rather conceptual nature. Moreover, publications fail to make explicit the complex internal structure of a simulation model. The current paper outlines the general structure of a BPS model. Building on these foundations, modeling tasks for the main components of a BPS model are identified. Moreover, the potential value of process mining and the state of the art in literature are discussed. Consequently, a multitude of promising research challenges are identified. In this sense, the current paper can guide future research on the use of process mining in a BPS context.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"473 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131934482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gender classification of subjects from cerebral blood flow changes using Deep Learning","authors":"T. Hiroyasu, K. Hanawa, U. Yamamoto","doi":"10.1109/CIDM.2014.7008672","DOIUrl":"https://doi.org/10.1109/CIDM.2014.7008672","url":null,"abstract":"In this study, using Deep Learning, the gender of subjects is classified from the cerebral blood flow changes that are measured by fNIRS. It is reported that cerebral blood flow changes are triggered by brain activities. Thus, if this classification has a high accuracy, gender classification should be related to brain activities. In the experiment, fNIRS data are derived from subjects who perform a memory task in a white noise environment. From the results, it is confirmed that the learning classifier exhibits high accuracy. This fact suggests that there exists a relation between cerebral blood flow changes and biological information.","PeriodicalId":117542,"journal":{"name":"2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129110048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}