{"title":"Practical Active Learning with Model Selection for Small Data","authors":"Maryam Pardakhti, Nila Mandal, A. Ma, Qian Yang","doi":"10.1109/ICMLA52953.2021.00263","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00263","url":null,"abstract":"Active learning is of great interest for many practical applications, especially in industry and the physical sciences, where there is a strong need to minimize the number of costly experiments necessary to train predictive models. However, there remain significant challenges for the adoption of active learning methods in many practical applications. One important challenge is that many methods assume a fixed model, where model hyperparameters are chosen a priori. In practice, it is rarely true that a good model will be known in advance. Existing methods for active learning with model selection typically depend on a medium-sized labeling budget. In this work, we focus on the case of having a very small labeling budget, on the order of a few dozen data points, and develop a simple and fast method for practical active learning with model selection. Our method is based on an underlying pool-based active learner for binary classification using support vector classification with a radial basis function kernel. First we show empirically that our method is able to find hyperparameters that lead to the best performance compared to an oracle model on less separable, difficult to classify datasets, and reasonable performance on datasets that are more separable and easier to classify. Then, we demonstrate that it is possible to refine our model selection method using a weighted approach to trade-off between achieving optimal performance on datasets that are easy to classify, versus datasets that are difficult to classify, which can be tuned based on prior domain knowledge about the dataset.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"8 1","pages":"1647-1653"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79505921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shapelets-based Data Augmentation for Time Series Classification","authors":"Peiyu Li, S. F. Boubrahimi, S. M. Hamdi","doi":"10.1109/ICMLA52953.2021.00222","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00222","url":null,"abstract":"Data augmentation is an important data mining task that has been highly adopted to resolve class imbalance problems and provide more input to data-hungry models. For the case of time series data, the data augmentation method needs to take into consideration the dependence of the variables. In this paper, we propose a new model that preserves important relations between variables while performing time series data augmentation. In particular, we combine shapelets transform and Synthetic Minority Oversampling Technique (SMOTE) models to achieve the aforementioned goal. By using shapelets transform, the most prominent shapelets are extracted from the training set and used during the oversampling process. To make the most use of important shapelets, our proposed method preserves the extracted shapelets as the key part in the synthetic data sample. Then for the other parts of each synthetic data sample, we use SMOTE to generate the remaining data points. Compared with pure SMOTE, our method makes full use of important shapelets to maintain the important correlations between interdependent variables, which also can provide more interpretive information.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"74 1","pages":"1373-1378"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79857589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Deep Learning of Nonlinear Fiber-Optic Communications Using a Convolutional Recurrent Neural Network","authors":"A. Shahkarami, Mansoor I. Yousefi, Y. Jaouën","doi":"10.1109/ICMLA52953.2021.00112","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00112","url":null,"abstract":"Nonlinear channel impairments are a major obstacle in fiber-optic communication systems. To facilitate a higher data rate in these systems, the complexity of the underlying digital signal processing algorithms to compensate for these impairments must be reduced. Deep learning-based methods have proven successful in this area. However, the concept of computational complexity remains an open problem. In this paper, a low-complexity convolutional recurrent neural network (CNN + RNN) is considered for deep learning of the long-haul optical fiber communication systems where the channel is governed by the nonlinear Schrodinger equation. This approach reduces the computational complexity via balancing the computational load by capturing short-temporal distance features using strided convolution layers with ReLU activation, and the long-distance features using a many-to-one recurrent layer. We demonstrate that for a 16-QAM 100 G symbol/s system over 2000 km optical-link of 20 spans, the proposed approach achieves the bit-error-rate of the digital back-propagation (DBP) with substantially fewer floating-point operations (FLOPs) than the recently-proposed learned DBP, as well as the non-model-driven deep learning-based equalization methods using end-to-end MLP, CNN, RNN, and bi-RNN models.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"79 1","pages":"668-673"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84119434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Medical Risk Scores for Pediatric Appendicitis","authors":"Pedro Roig Aparicio, Ricards Marcinkevics, Patricia Reis Wolfertstetter, S. Wellmann, C. Knorr, Julia E. Vogt","doi":"10.1109/ICMLA52953.2021.00243","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00243","url":null,"abstract":"Appendicitis is a common childhood disease, the management of which still lacks consolidated international criteria. In clinical practice, heuristic scoring systems are often used to assess the urgency of patients with suspected appendicitis. Previous work on machine learning for appendicitis has focused on conventional classification models, such as logistic regression and tree-based ensembles. In this study, we investigate the use of risk supersparse linear integer models (risk SLIM) for learning data-driven risk scores to predict the diagnosis, management, and complications in pediatric patients with suspected appendicitis on a dataset consisting of 430 children from a tertiary care hospital. We demonstrate the efficacy of our approach and compare the performance of learnt risk scores to previous analyses with random forests. Risk SLIM is able to detect medically meaningful features and outperforms the traditional appendicitis scores, while at the same time is better suited for the clinical setting than tree-based ensembles.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"35 1","pages":"1507-1512"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80277836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying Challenging Behaviors in Autism Spectrum Disorder with Word Embeddings","authors":"Abigail Atchison, Gabriela Pinto, A. Woodward, Elizabeth Stevens, Dennis R. Dixon, Erik J. Linstead","doi":"10.1109/ICMLA52953.2021.00215","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00215","url":null,"abstract":"The understanding and treatment of challenging behaviors in individuals with Autism Spectrum Disorder are paramount to enabling the success of behavioral therapy; an essential step in this process is the labeling of challenging behaviors demonstrated in therapy sessions. This paper seeks to add quantitative depth to this otherwise qualitative task of challenging behavior classification. Here we leverage neural document embeddings with Word2Vec to represent clinical notes capturing 1,917 recorded instance of challenging behaviors from therapy sessions conducted by a large autism treatment provider. These embeddings then serve as training data for supervised machine learning algorithms in both binary and multiclass classification tasks to identify challenging behaviors, achieving high classification accuracies ranging from 82.7% to 98.5%. We demonstrate that the semantic queues derived from the language of challenging behavior descriptions, modeled using natural language processing techniques, can be successfully leveraged to extract and identify challenging behaviors from real-world clinical data.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"32 1","pages":"1325-1332"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80795809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transform-Based Tensor Auto Regression for Multilinear Time Series Forecasting","authors":"Jackson Cates, R. Hoover, Kyle A. Caudle, Riley Kopp, Cagri Ozdemir","doi":"10.1109/ICMLA52953.2021.00078","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00078","url":null,"abstract":"With the massive influx of 2-dimensional observational data, new methods for analyzing, modeling, and forecasting multidimensional data need to be developed. The current research aims to accomplish these goals through the intersection of time-series modeling and multi-linear algebraic systems. In particular, the current research, aptly named the $mathcal{L}$-Transform Tensor Auto-Regressive ($mathcal{L}$-TAR for short) model expands previous auto-regressive techniques to forecast data from multilinear observations as oppose to scalars or vectors. The approach is based on recent developments in tensor decompositions and multilinear tensor products. Transforming the multilinear data through invertible discrete linear transforms enables statistical Independence between observations. As such, can be reformulated to a collection of vector auto-regression problems for model learning. Experimental results are provided on benchmark datasets containing image collections, video sequences, sea surface temperature measurements, and stock closing prices.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"29 1","pages":"461-466"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83645473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"COVID-CBR: A Deep Learning Architecture Featuring Case-Based Reasoning for Classification of COVID-19 from Chest X-Ray Images","authors":"Xiaohong W. Gao, Alice Gao","doi":"10.1109/ICMLA52953.2021.00214","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00214","url":null,"abstract":"Background and Objectives: This study aims to assist rapid accurate diagnosis of COVID-19 based on chest x-ray (CXR) images to provide supplementary information, leading to screening program for early detection of COVID-19 based on CXR images by developing an interpretable, robust and performant AI system. Methods: A case-based reasoning approach built upon autoencoder deep learning architecture is applied to classify COVID-19 from other non-COVID-19 as well as normal subjects from chest x-ray images. The system integrates the interpretation and decision-making together by producing a set of profiles that in appearance resemble the training samples and hence explain the outcome of classifications. Three classes are studied, which are COVID-19 (n=250), other non-COVID-19 diseases (NCD) (n=384), including TB and ARDS, and normal (n=327). Results: This COVID-CBR system sustains the average sensitivity and specificity of 93.1±3.58% and 96.1±4.10% respectively for classification of these three classes. In comparison with the current state of the art, including COVID-Net, VGG-16 and other explainable AI systems, the developed COVID-CBR system appears to perform similar or better when classifying multi-class categories. Conclusion: This paper presents a case-based reasoning deep learning system for detection of COVID19 from chest x-ray images. Comparison with several state of the art systems is conducted. Although the improvement tends to be marginal, especially for VGG-16, the novelty of this work manifests its interpretable feature building upon case-based reasoning, leading to revealing this viral insight and hence ascertaining more effective treatment and drugs while maintaining being transparent. Furthermore, different from several other current explainable networks that highlight key regions or the points of an input that activate the network, i.e. heat maps, this work is constructed upon whole training images, i.e. case-based, whereby each training image belongs to one of the case clusters.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"38 1","pages":"1319-1324"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84843799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Thresholding Strategies for Highly Imbalanced and Noisy Data","authors":"Justin M. Johnson, T. Khoshgoftaar","doi":"10.1109/ICMLA52953.2021.00192","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00192","url":null,"abstract":"Many studies have shown that non-default decision thresholds are required to maximize classification performance on highly imbalanced data sets. Thresholding strategies include using a threshold equal to the prior probability of the positive class or identifying an optimal threshold on training data. It is not clear, however, how these thresholding strategies will generalize to imbalanced data sets that contain class label noise. When class noise is present, the positive class prior is influenced by the class label noise, and a threshold that is optimized on noisy training data may not generalize to test data. We employ four thresholding strategies: two thresholds that are optimized on training data and two thresholds that depend on the positive class prior. Threshold strategies are evaluated on a range of noise levels and noise distributions using the Random Forest, Multilayer Perceptron, and XGBoost learners. While all four thresholding strategies significantly outperform the default threshold with respect to the Geometric Mean (G-Mean), three of the four thresholds yield unstable true positive rates (TPR) and true negative rates (TNR) in the presence of class noise. Results show that setting the threshold equal to the prior probability of the noisy positive class consistently performs best according to G-Mean, TPR, and TNR. This is the first evaluation of thresholding strategies for imbalanced and noisy data, to the best of our knowledge, and our results contradict related works that have suggested optimizing thresholds on training data as the best approach.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"17 1","pages":"1182-1188"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87179896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Homogeneous Transfer Active Learning for Time Series Classification","authors":"P. Gikunda, Nicolas Jouandeau","doi":"10.1109/ICMLA52953.2021.00129","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00129","url":null,"abstract":"The scarcity of labeled time-series data is a major challenge in use of deep learning methods for Time Series Classification tasks. This is especially important for the growing field of sensors and Internet of things, where data of high dimensions and complex distributions coming from the numerous field devices has to be analyzed to provide meaningful applications. To address the problem of scarce training data, we propose a heuristic combination of deep transfer learning and deep active learning methods to provide near optimal training abilities to the classification model. To mitigate the need of labeling large training set, two essential criteria – informativeness and representativeness have been proposed for selecting time series training samples. After training the model on source dataset, we propose a framework for the model skill transfer to forecast certain weather variables on a target dataset in an homogeneous transfer settings. Extensive experiments on three weather datasets show that the proposed hybrid Transfer Active Learning method achieves a higher classification accuracy than existing methods, while using only 20% of the training samples.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"30 1","pages":"778-784"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90638247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal Debiasing using Adversarial Loss based GNN architecture for Crypto Fraud Detection","authors":"Aditya Singh, Anubhav Gupta, H. Wadhwa, Siddhartha Asthana, Ankur Arora","doi":"10.1109/ICMLA52953.2021.00067","DOIUrl":"https://doi.org/10.1109/ICMLA52953.2021.00067","url":null,"abstract":"The tremendous rise of cryptocurrency in the payment domain has unlocked huge opportunities but also raised numerous challenges in parallel involving cybercriminal activities like money laundering, terrorist financing, illegal and risky services, etc, owing to its anonymous and decentralized setup. The demand for building a more transparent cryptocurrency network, resilient to such activities, has risen extensively as more financial institutions look to incorporate it into their network. While a plethora of traditional machine learning and graph based deep learning techniques have been developed to detect illicit activities in a cryptocurrency transaction network, the challenge of generalization and robust model performance on future timesteps still exists. In this paper, we show that the model learned on transactional feature set provided in dataset (Elliptic Dataset) carry a temporal bias, i.e. they are highly dependent on the timesteps they occur. Deploying temporally biased models limits their performance on future timesteps. To address this, we propose a temporal debiasing technique using GNN based architecture that ensures generalization by adversarially learning between fraud 1 classification and temporal classification. The adversarial loss constructed optimizes the embeddings to ensure they 1.) perform well on fraud classification task 2.) does not contain temporal bias. The proposed architecture capture the underlying fraud patterns that remain consistent over time. We evaluate the performance of our proposed architecture on the Elliptic dataset and compare the performance with existing machine learning and graph-based architectures.1Fraud and illicit are used interchangeably in this paper","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"4 1","pages":"391-396"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91011181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}