{"title":"Large-Scale Entity Extraction from Enterprise Data","authors":"Rajeev Gupta, Ranganath Kondapally","doi":"10.1145/3564121.3564818","DOIUrl":"https://doi.org/10.1145/3564121.3564818","url":null,"abstract":"Adoption of cloud computing by enterprises has exploded in the last decade, and most of the applications used by enterprise users have moved to the cloud. These applications include collaboration software (e.g., Word, Excel), instant messaging (e.g., Chat), asynchronous communication (e.g., Email), etc. This has resulted in an exponential increase in the volume of data arising from the interactions of users with these online applications (such as documents edited, people interacted with, meetings attended, etc.). A user's activities provide strong insights about them; for example, the meetings attended indicate the set of people the user closely works with, and the documents edited indicate the topics the user works on. Typically, this data is private and confidential to the enterprise, a part of the enterprise, or the individual employee. To provide a better experience and assist employees in their activities, it is critical to mine certain entities from this data. In this tutorial, we describe various entities that can be extracted from enterprise data to assist employees in their productivity. Specifically, we define and extract various enterprise entities such as tasks, commitments, calendar activities, acronyms, topics, definitions, etc. These entities are extracted using different techniques: tasks and commitments are extracted using intent mining techniques (e.g., sentiment extraction), definitions are extracted using sequence mining techniques, calendars are updated using the user's flight/hotel booking entities, etc. 
Entity extraction from enterprise data poses an interesting and complex challenge from a scalable information extraction point of view: we must build information extraction models with little data to learn from, due to privacy and access-control constraints, yet the models must be highly accurate on a large amount of diverse data from across the whole enterprise. Specifically, we need to overcome the following challenges: Privacy: For legal and trust reasons, an individual user's data should be accessible only to the persons for whom it is intended. Thus, we cannot directly apply the openly available techniques used to mine these entities, all of which require labeled data. Efficiency: As enterprises need to process billions of emails, chats, and other documents every day—different for different users—extraction models need to be very efficient. Scalability: There are a large number of variations in the way information is presented in enterprise documents. For example, a flight itinerary is represented in different ways by different providers, and the definition of the same topic can be expressed differently in different documents. We should be able to extract entities irrespective of the way they are presented in the documents. Multi-lingual: Users are located across geographies, and hence, information extraction needs to be done across multiple languages. To extract these entities, one needs supervised data. 
How to get labeled data?","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129738177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Patch-wise Features for Blur Image Classification","authors":"Sri Charan Kattamuru, Kshitij Agrawal, S. Adhikari, Abhishek Bose, Hemant Misra","doi":"10.1145/3564121.3564138","DOIUrl":"https://doi.org/10.1145/3564121.3564138","url":null,"abstract":"Images captured through smartphone cameras often suffer from degradation, blur being one of the major ones, posing a challenge in processing these images for downstream tasks. In this paper, we propose low-compute, lightweight patch-wise features for image quality assessment. Using our method, we can discriminate between blurred and sharp images. To this end, we train a decision-tree-based XGBoost model on various intuitive image features, such as gray-level variance, first- and second-order gradients, and texture features like local binary patterns. Experiments conducted on an open dataset show that the proposed low-compute method achieves 90.1% mean accuracy on the validation set, which is comparable to the 94% mean accuracy of a compute-intensive VGG16 network fine-tuned for this task. To demonstrate the generalizability of our proposed features and model, we test the model on the BHBID dataset and an internal dataset, where we attain accuracies of 98% and 91%, respectively. 
The proposed method is 10x faster than the VGG16-based model on CPU and scales linearly with input image size, making it suitable for implementation on low-compute edge devices.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115639833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate and Efficient Channel pruning via Orthogonal Matching Pursuit","authors":"Kiran Purohit, Anurag Parvathgari, Soumili Das, Sourangshu Bhattacharya","doi":"10.1145/3564121.3564139","DOIUrl":"https://doi.org/10.1145/3564121.3564139","url":null,"abstract":"The deeper and wider architectures of recent convolutional neural networks (CNNs) are responsible for their superior performance in computer vision tasks. However, they also come with an enormous model size and heavy computational cost. Filter pruning (FP) is one of the methods applied to CNNs for compression and acceleration, and various techniques for it have been proposed recently. We address the limitations of the existing state-of-the-art method and motivate our setup. We develop a novel method for filter selection using sparse approximation of filter weights, and propose an orthogonal matching pursuit (OMP) based algorithm for filter pruning (called FP-OMP). We also propose FP-OMP Search, which addresses the problem of removing a uniform number of filters from all the layers of a network. FP-OMP Search performs a search over all the layers with a given batch size of filter removal. We evaluate both FP-OMP and FP-OMP Search on benchmark datasets using standard ResNet architectures. Experimental results indicate that FP-OMP Search consistently outperforms the baseline method (LRF) by nearly . We demonstrate, both empirically and visually, that FP-OMP Search prunes different numbers of filters from different layers. 
Further, timing profile experiments show that FP-OMP improves over the running time of LRF.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117207241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DNN based Adaptive User Pairing and Power Allocation to achieve α-Fairness in NOMA Systems with Imperfections in SIC","authors":"Siva Mouni Nemalidinne, Pavan Reddy Manne, Abhinav Kumar, P. K. Upadhyay","doi":"10.1145/3564121.3565042","DOIUrl":"https://doi.org/10.1145/3564121.3565042","url":null,"abstract":"Non-orthogonal multiple access (NOMA) technology aided with successive interference cancellation (SIC) is expected to achieve multi-fold improvements in network capacity. However, in practice, SIC is prone to imperfections, which degrades the achievable gains of NOMA. Additionally, inappropriate user pairing and power allocation in NOMA can adversely affect the fairness between paired users. Hence, the impact of imperfections in SIC and fairness should be considered for user pairing and power allocation in NOMA. Motivated by this, we formulate user pairing and power allocation to achieve α-fairness among the paired users as an optimization problem. To obtain a feasible solution in practice, we then propose a two-step machine learning-based approach to solve the problem. We use a random forest classifier (RFC) to establish a pairing criterion and a deep neural network (DNN) to allocate the power factors to the NOMA pair. The performance of the proposed supervised learning (SL) models is extensively evaluated and compared with other pre-existing algorithms. We analyze the performance of the DNN for a varying number of neurons in the hidden layer, considering different activation functions. We show that with 4 neurons in the hidden layer and a sigmoid activation function, the trained DNN outperforms the existing SL algorithms. We then use the trained network and perform Monte Carlo simulations to quantify the achievable gains. We show that the proposed approach achieves an excellent solution that maximizes fairness while also ensuring the minimum required data rate for each user. 
Through extensive numerical evaluations, we show that our proposed two-step machine learning approach outperforms various state-of-the-art algorithms.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121601738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CluSpa: Computation Reduction in CNN Inference by exploiting Clustering and Sparsity","authors":"Imlijungla Longchar, Amey Varhade, Chetan Ingle, Saurabh Baranwal, H. Kapoor","doi":"10.1145/3564121.3564132","DOIUrl":"https://doi.org/10.1145/3564121.3564132","url":null,"abstract":"Convolutional Neural Networks (CNNs) have grown tremendously in popularity and usage over the last few years, spanning tasks such as computer vision, natural language processing, video recognition, and recommender systems. Despite the algorithmic advancements that drove this growth, CNNs still have considerable computational and memory overheads that pose challenges in achieving real-time performance. Each input image requires millions or even billions of elementary arithmetic operations before the network obtains the result. In CNNs, convolutional and pooling layers are followed by activation layers involving various activation functions. Hence, a lot of work has been done in the last few years to reduce these costs, with numerous optimizations proposed at both the hardware and software levels. In this paper, we propose a software-based solution for improving the inference performance of networks. We suggest a technique for approximate computation of the convolution operation based on clustering and sharing of weights, using Gaussian Mixture Models for clustering. We exploit weight sparsity to further reduce computations on top of the clustering method. 
We achieve a considerable reduction in MAC operations and an overall computation speedup on popular CNN architectures.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115153883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Vector Store System for Python using Shared Memory","authors":"Dhruv Patel, S. Pandey, Abhishek Sharma","doi":"10.1145/3564121.3564799","DOIUrl":"https://doi.org/10.1145/3564121.3564799","url":null,"abstract":"Many e-commerce companies use machine learning to improve customer experience. Even within a single company, there are generally many independent services running, each specializing in some aspect of customer experience. Since machine learning models work on abstract vectors representing users and/or items, each such service needs a way to store these vectors. A common approach nowadays is to save them in in-memory caches like Memcached. As these caches run in their own processes, and machine learning services generally run as Python services, there is a communication overhead involved for each request that the ML service serves. One can reduce this overhead by directly storing these vectors in a Python dictionary within the service. To support concurrency and scale, a single node runs multiple instances of the same service; thus, we also want to avoid duplicating these vectors across multiple processes. In this paper, we propose a system to store vectors in shared memory and efficiently serve all concurrent instances of the service, without replicating the vectors themselves. We achieve up to 4.5x improvements in latency compared to Memcached. Additionally, due to the availability of more memory, we can increase the number of server processes running in each node, translating into greater throughput. 
We also discuss the impact of the proposed method on throughput in a live production scenario.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123190322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tutorial: Neuro-symbolic AI for Mental Healthcare","authors":"Kaushik Roy, Usha Lokala, Manas Gaur, Amit P. Sheth","doi":"10.1145/3564121.3564817","DOIUrl":"https://doi.org/10.1145/3564121.3564817","url":null,"abstract":"Artificial Intelligence (AI) systems for mental healthcare (MHCare) have been ever-growing since the importance of early interventions for patients with chronic mental health (MH) conditions was realized. Social media (SocMedia) emerged as the go-to platform for supporting patients seeking MHCare. The creation of peer-support groups without social stigma has resulted in patients transitioning from clinical settings to SocMedia-supported interactions for quick help. Researchers started exploring SocMedia content in search of cues that showcase correlation or causation between different MH conditions, in order to design better interventional strategies. User-level classification-based AI systems were designed to leverage diverse SocMedia data from various MH conditions to predict MH conditions. Subsequently, researchers created classification schemes to measure the severity of each MH condition. Such ad-hoc schemes, engineered features, and models not only require a large amount of data but also fail to allow clinically acceptable and explainable reasoning over the outcomes. To improve Neural-AI for MHCare, infusion of the clinical symbolic knowledge that clinicians use in decision making is required. An impactful use case of Neural-AI systems in MH is conversational systems. These systems require coordination between classification and generation to facilitate humanistic conversation in conversational agents (CA). Current CAs with deep language models lack factual correctness, medical relevance, and safety in their generations, which intertwine with unexplainable statistical classification techniques. 
This lecture-style tutorial will demonstrate our investigations into Neuro-symbolic methods of infusing clinical knowledge to improve the outcomes of Neural-AI systems, and thereby the interventions for MHCare: (a) We will discuss the use of diverse clinical knowledge in creating specialized datasets to train Neural-AI systems effectively. (b) Patients with cardiovascular disease express MH symptoms differently based on gender differences; we will show that knowledge-infused Neural-AI systems can identify gender-specific MH symptoms in such patients. (c) We will describe strategies for infusing clinical process knowledge as heuristics and constraints to improve language models in generating relevant questions and responses.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126691301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Graph based Recommender System with Weighted Averaging of Messages","authors":"Faizan Ahemad","doi":"10.1145/3564121.3564127","DOIUrl":"https://doi.org/10.1145/3564121.3564127","url":null,"abstract":"We showcase a novel solution to a recommendation system problem where we face a perpetual soft item cold start issue. Our system aims to recommend demanded products to prospective sellers for listing in Amazon stores. These products always have only a few interactions, thereby giving rise to a perpetual soft item cold start situation. Modern collaborative filtering methods solve cold start using content attributes and exploit the existing implicit signals from warm start items. This approach fails in our use-case since our entire item set always faces a cold start issue. Our Product Graph has over 500 Million nodes and over 5 Billion edges, which makes training and inference using modern graph algorithms very compute-intensive. To overcome these challenges, we propose a system that reduces the dataset size and employs an improved modelling technique to reduce storage and compute without loss of performance. Particularly, we reduce our graph size using a filtering technique and then exploit this reduced product graph using the Weighted Averaging of Messages over Layers (WAML) algorithm. 
WAML simplifies training on large graphs and improves over previous methods by reducing compute time to of LightGCN [8] and of Graph Attention Network (GAT) [20] and increasing recall@100 by over LightGCN and over GAT.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115668040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Early Exit in DNNs with Multiple Exits","authors":"U. HariNarayanN, M. Hanawal, Avinash Bhardwaj","doi":"10.1145/3564121.3564137","DOIUrl":"https://doi.org/10.1145/3564121.3564137","url":null,"abstract":"Deep Neural Networks (DNNs) are generally designed as sequentially cascaded differentiable blocks/layers with a prediction module connected only to the last layer. DNNs can be attached with prediction modules at multiple points along the backbone, where inference can stop at an intermediary stage without passing through all the modules. The last exit point may offer a better prediction error but also involves more computational resources and latency. An exit point that is ‘optimal’ in terms of both prediction error and cost is desirable. The optimal exit point may depend on the latent distribution of the tasks and may change from one task type to another. During neural inference, the ground truth of instances may not be available, so the error rates at each exit point cannot be estimated. Hence, one is faced with the problem of selecting the optimal exit in an unsupervised setting. Prior works tackled this problem in an offline supervised setting, assuming that enough labeled data is available to estimate the error rate at each exit point and tune the parameters for better accuracy. However, pre-trained DNNs are often deployed in new domains for which a large amount of ground truth may not be available. We thus model exit selection as an unsupervised online learning problem and leverage bandit theory to identify the optimal exit point. Specifically, we focus on Elastic BERT, a pre-trained multi-exit DNN, to demonstrate that it ‘nearly’ satisfies the Strong Dominance (SD) property, making it possible to learn the optimal exit in an online setup without knowing the ground truth labels. We develop an upper confidence bound (UCB) based algorithm, named UEE-UCB, that provably achieves sub-linear regret under the SD property. 
Thus, our method provides a means to adaptively learn domain-specific optimal exit points in multi-exit DNNs. We empirically validate our algorithm on the IMDb and Yelp datasets.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124561232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the Second International Conference on AI-ML Systems","authors":"","doi":"10.1145/3564121","DOIUrl":"https://doi.org/10.1145/3564121","url":null,"abstract":"","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128522657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}