{"title":"Compression of Deep Learning Models for NLP","authors":"Manish Gupta","doi":"10.1145/3486001.3486247","DOIUrl":"https://doi.org/10.1145/3486001.3486247","url":null,"abstract":"In recent years, the fields of NLP and information retrieval have made tremendous progress thanks to deep learning models like RNNs and LSTMs, and Transformer [36] based models like BERT [9]. But these models are very large. Real-world applications, however, demand small model size, low response times, and low power consumption. We will discuss six different types of methods (pruning, quantization, knowledge distillation, parameter sharing, matrix decomposition, and other Transformer based methods) for compression of such models to enable their deployment in real industry NLP projects. Given the critical need to build applications with efficient and small models, and the large amount of recently published work in this area, we believe that this tutorial is very timely. We will organize related work done by the ‘deep learning for NLP’ community in the past few years and present it as a coherent story.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126648603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing Convolutions for an Inference Accelerator: Case Study: Intel’s NNP-I 1000 DL Compute Grid","authors":"Durgadoss, Kausik Maiti, Sanju C Sudhakaran, Isha Agarwal, Kartik Podugu, Pavan Kumar, Jitender Patil, A. Chawla","doi":"10.1145/3486001.3486239","DOIUrl":"https://doi.org/10.1145/3486001.3486239","url":null,"abstract":"With Deep Learning (DL) surpassing humans in Image Recognition and Machine Translation related tasks, the demand for specialized hardware has increased in the recent past. DL Accelerators belong to a category of such purpose-built hardware that promises compelling performance for Neural Net computations. But specialized hardware needs a powerful compiler to unlock its full potential. This paper discusses the Code Generator and Optimizer (CGO) that produces an optimized tiling as well as a schedule of Convolution operations for the DL Compute Grid in Intel’s NNP-I 1000 platform. This paper also presents some of the key optimization techniques used and the associated performance gains across a rich variety of Deep Learning workloads.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"61 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129533350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CryptInfer: Enabling Encrypted Inference on Skin Lesion Images for Melanoma Detection","authors":"Nayna Jain, Karthik Nandakumar, N. Ratha, Sharath Pankanti, U. Kumar","doi":"10.1145/3486001.3486233","DOIUrl":"https://doi.org/10.1145/3486001.3486233","url":null,"abstract":"Deep learning models such as Convolutional Neural Networks (CNNs) have shown the potential to classify medical images for accurate diagnosis. These techniques will face regulatory compliance challenges related to privacy of user data, especially when they are deployed as a service on a cloud platform. Fully Homomorphic Encryption (FHE) can enable CNN inference on encrypted data and help mitigate such concerns. However, encrypted CNN inference faces the fundamental challenge of optimizing the computations to achieve an acceptable trade-off between accuracy and practical computational feasibility. Current approaches for encrypted CNN inference demonstrate feasibility typically on smaller images (e.g., MNIST and CIFAR-10 datasets) and shallow neural networks. This work is the first to show encrypted inference results on a real-world dataset for melanoma detection with large-sized images of skin lesions based on the Cheon-Kim-Kim-Song (CKKS) encryption scheme available in the open-source HElib library. The practical challenges related to encrypted inference are first analyzed and inference experiments are conducted on encrypted MNIST images to evaluate different optimization strategies and their role in determining the throughput and latency of the inference process. Using these insights, a modified LeNet-like architecture is designed and implemented to achieve the end goal of enabling encrypted inference on melanoma dataset. 
The results demonstrate that 80% classification accuracy can be achieved on encrypted skin lesion images (security of 106 bits) with a latency of 51 seconds for single image inference and a throughput of 18,000 images per hour for batched inference, which shows that privacy-preserving machine learning as a service (MLaaS) based on encrypted data is indeed practically feasible.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127937227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time Social Distancing Detection System with Auto-calibration using Pose Information","authors":"Gaku Nakano, Shoji Nishimura","doi":"10.1145/3486001.3486245","DOIUrl":"https://doi.org/10.1145/3486001.3486245","url":null,"abstract":"In this demo, we present a real-time social distancing system with auto-calibration using human pose information. Our system first calculates the geometric parameters of a camera in 3D space, i.e., position and rotation, and then measures absolute distances between pedestrians. Detection results are visualized as 3D circles in input images and a bird’s eye view. All processing steps are completely automatic; therefore, our system can be applied to uncalibrated surveillance cameras that are already installed in towns. A demo video is available in the supplemental material.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128800770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging between Data Science and Performance Analysis: Tracing of Jupyter Notebooks","authors":"Elias Werner, Lalith Manjunath, Jan Frenzel, Sunna Torge","doi":"10.1145/3486001.3486249","DOIUrl":"https://doi.org/10.1145/3486001.3486249","url":null,"abstract":"In recent years, an increasing amount of available data has led to new application approaches and an application field that is now called data science (DS). Such applications often require low runtimes while having to deal with restricted compute resources. Up to now, we perceive that the DS community lacks tool support for runtime and resource usage investigations. Thus, we present an approach that combines DS and performance analysis from the High Performance Computing domain. Our concept integrates the measurement framework Score-P in Jupyter, a popular editor for the development of DS applications. We designed and implemented a custom Jupyter kernel that collects runtime data and applied it to a natural language processing application. The measurement overhead was 12.55 seconds. A benefit is that the collected data can then be visualised using established performance analysis tools.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123350857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word-level beam search decoding and correction algorithm (WLBS) for end-to-end ASR","authors":"S. Zitha, Prabhakar Venkata Tamma","doi":"10.1145/3486001.3486223","DOIUrl":"https://doi.org/10.1145/3486001.3486223","url":null,"abstract":"A key challenge in resource-constrained speech recognition applications is the unavailability of a large, domain-specific audio corpus to train the models. In such scenarios, models may not be exposed to a wide range of domain-specific words and phrases. In this work, we propose an approach to improve the in-domain automatic speech recognition results using our word-level beam search decoding and correction algorithm (WLBS). We use a token-based language model to mitigate the data sparsity and the out-of-vocabulary issues in the corpus. We evaluate the proposed approach on an airplane-cabin-specific announcement use case. The experimental results show that the WLBS algorithm, with its handling of misspellings and missing words, achieves better performance than state-of-the-art beam search decoding and n-gram LMs. We report a WER of 11.48% on our airplane-cabin announcement test corpus.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"44 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114009862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Deep Learning framework for Interoperable Machine Learning","authors":"Shakkeel Ahmed, Prakash Bisht, Ravi Mula, S. Dhavala","doi":"10.1145/3486001.3486243","DOIUrl":"https://doi.org/10.1145/3486001.3486243","url":null,"abstract":"In this paper, we introduce an opinionated, extensible Python framework that transpiles a variety of classical Statistical and Machine Learning models onto the Open Neural Network Exchange (ONNX) format via an underlying Deep Learning model. We achieve this by exploiting the compositionality of Deep Learning technology. By appropriately choosing the features, architecture, loss functions, and regularizers, among others, the fidelity between the source model and the target model can be specified. Depending on the model being transpiled, the fidelity can be exact or approximate. We present the design details, APIs of the framework, reference implementations, road map for development, and guidelines for contributions. A reference implementation is available for the popular scikit-learn APIs.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127808157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stacked Sparse Autoencoder and Machine Learning Based Anxiety Classification Using EEG Signals","authors":"S. Shikha, Manan Agrawal, Mohd Anwar, Divyashikha Sethia","doi":"10.1145/3486001.3486227","DOIUrl":"https://doi.org/10.1145/3486001.3486227","url":null,"abstract":"Anxiety is an emotion characterized by trepidation, stress, or uneasiness that involves extreme worry or fear over future unwanted events or an actual situation. Careful analysis of anxiety is critical since approximately 2 to 4% of the general population have experienced adequate symptoms indicating an anxiety disorder. This paper aims to classify anxiety levels based on machine learning and deep learning algorithms with improved performance. This work uses the publicly available DASPS Database (Database for Anxious States based on a Psychological Stimulation). The dataset consists of EEG recordings from 23 participants during anxiety elicitation through face-to-face psychological stimuli. This work uses RFECV with the classifiers to reduce redundancy between features and improve results. We achieve the highest classification accuracies of 83.93% and 70.25% using Stacked Sparse Autoencoder and Decision Tree for two-class anxiety classification.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115562933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"InuAid Chest X-rays AI Product: Respiratory diseases detection using Artificial Intelligence to significantly improve productivity and quality of diagnosis","authors":"Prem Kumar S. R., Vikas Kumar Anand, S. Senapati, A. Murali","doi":"10.1145/3486001.3486242","DOIUrl":"https://doi.org/10.1145/3486001.3486242","url":null,"abstract":"The spread of the Covid-19 virus around the world has taken many lives, quarantined people, and shattered many industries. Due to the high transmissibility of the virus and its silent incubation period in human beings, detection of the virus plays an important role in controlling its spread and in planning diagnostic and preventive measures. Laboratory tests such as Polymerase Chain Reaction (PCR) take more time, and hence there is a need for rapid and accurate diagnostic methods to detect the virus, prevent its spread, and combat it. So far, PCR tests have been used for diagnosis and chest x-rays only for patient follow-up; hence, studies on the chest x-rays of patients with Covid-19 pneumonia or any other disease are still limited in the literature and must be improved in the future. In this project, the goal is to build an application for healthcare workers to monitor the health of lungs using the chest x-ray images of patients. The algorithm must be very accurate because it deals with the lives of people. We used computer vision and deep learning techniques in this project. The focus is to classify chest x-ray images, segment the abnormal region, and gain more insights into the images from the available datasets. Achieving high diagnostic accuracy and detection efficiency is the challenging part because of the limited open-source data available. The data was collected from the internet. On classification, the trained model was able to achieve 93.10% accuracy and an F1 Score of 0.93 after using a transfer learning technique with pneumonia images. 
On segmentation, the Intersection Over Union value was found to be 0.91 on the validation data.","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"3 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114132816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Prediction Confidence For Versatile Optimizations to CNNs","authors":"Saksham Sharma, Vanshika V Bhargava, Aditya Singh, K. Bhardwaj, Sparsh Mittal","doi":"10.1145/3486001.3486222","DOIUrl":"https://doi.org/10.1145/3486001.3486222","url":null,"abstract":"Modern convolutional neural networks (CNNs) incur huge computational and energy overheads. In this paper, we propose two techniques for inferring the confidence in the correctness of a prediction in the early layers of a CNN. The first technique uses a statistical approach, whereas the second technique requires retraining. We argue that prediction confidence estimation can enable diverse optimizations to CNNs. We demonstrate two optimizations. First, we predict selected images in early layers. This is possible because in a dataset, many images are easy to predict and they can be predicted in the early layers of a CNN. This reduces the average computation count at the cost of accuracy and parameter count. Second, we propose predicting only selected images for which the prediction-confidence is high. This reduces the coverage; however, the accuracy on the images that are predicted is higher. Our results with VGG16 and ResNet50 CNNs on the Caltech256 dataset show that our techniques are effective. For example, for ResNet, our first technique reduces the accuracy from 71.6% to 69.8% while reducing the computations by 14%. Similarly, with the second technique, on reducing the coverage from 100% to 90%, the accuracy is increased from 71.6% to 75.6%. 
Keywords: computer vision, CNN, approximate computing, accuracy-coverage tradeoff, prediction confidence","PeriodicalId":266754,"journal":{"name":"Proceedings of the First International Conference on AI-ML Systems","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114371233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}