{"title":"Performance Analysis of Machine Learning Algorithms for Disease Prediction","authors":"B. Priya, C. Chaitra, K. Reddy","doi":"10.1109/GHCI50508.2021.9514000","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9514000","url":null,"abstract":"With the recent technological advances in microelectronics, wireless communication, machine learning (ML), and decision-making process, Wireless Body Area Network (WBAN) has become the most promising technology. As we all know that we are in global pandemic due to Covid-19 situation now, hence, there is a demand occurring in health care services and continuous monitoring. Moreover, prediction of abnormalities at an early stage will be crucial for a person in diagnosis. Hence, in this paper we have developed and compared the performance of three machine learning algorithms such as Decision Tree Classifier (DTC), K-Nearest Neighbor (KNN), and Random Forest (RF). Each algorithm is tested with datasets of 100, 200, 500 & 1000 users respectively. Further, threshold values have been identified by consulting with doctors for accurate disease prediction based on the vital signals collected by various sensors. The three algorithms used are based on supervised learning, where the output is predicted based on the training of the developed classifier. 
From the results, it is observed that the accuracy in disease prediction using RF is 0.99 & outperformed when compared with state of the art for datasets of 1000 users.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126898554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Handwritten Documents Text Recognition with Novel Pre-processing and Deep Learning","authors":"Nidhi, D. Ghosh, Dharmendra Chaurasia, Saurav Mondal, Asmita Mahajan","doi":"10.1109/GHCI50508.2021.9514054","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9514054","url":null,"abstract":"Data being the most valuable resource on earth today, there is a pressing need to transform all data digitally. The world has been historically and still in multiple sectors operating based on handwritten text. For example, Handwritten taxation data and calculations, claim forms, doctor’s prescription, legal documents, resumes, financial documents, accounts sheets, and many more. Handwritten data can be very easily misinterpreted even by human personnel. Hence, the most critical challenge that remains is the transformation of handwritten documents. Researches from different authors present an application wherein they take in word crop images generated from a document manually and extract the text from it. In our thesis, we present an end to end solution wherein a text image is uploaded, and handwritten text from the entire image is extracted as-is. The three main contributions of our thesis are 1. Improved text localizer trained on the various handwritten and printed dataset, 2. Novel classification model to segregate printed and handwritten words, 3. 
Effective image preprocessing techniques applied to handwritten word crops to make them eligible to be fed to the deep learning model for improved overall accuracy.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124414385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stress Analysis for Students in Online Classes","authors":"Chhavi Sharma, Pranjal Saxena","doi":"10.1109/GHCI50508.2021.9514059","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9514059","url":null,"abstract":"This paper aims to identify the stress levels of students in Massive Open Online Courses(MOOCs). Research shows that there is a lack of sentiment analysis for online classes and hence a higher attrition rate. We thus aim to help instructors identify the stressed students. Using student posts from online platform “Piazza” as input, we perform various stress detection analysis methods like Naive Bayes, ANEW, VADER and SentiWords. These stressed posts from each method are extracted to compare accuracy with baseline dataset. This research provides unique solutions to detect the student sentiment in formal environment which can help reduce stress and improve the students’ overall performance.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132971512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Template-based NLG for tabular data using BERT","authors":"Srushti Gajbhiye, M. Lopes","doi":"10.1109/GHCI50508.2021.9514032","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9514032","url":null,"abstract":"With the data size growing exponentially, machines need to be well-equipped to understand all kinds of data. Tabular content is preferred over textual content by humans as it presents inter-related data in a simplified way. Humans are also able to co-relate two or more tables with each other, even when it is not explicitly stated. Machines lack both of these abilities, making it taxing to work directly with tables. This paper proposes an approach to summarize tabular data from PDF documents and convert it to textual content as is better suited for machine consumption. The generated content delivers insights to humans and minimizes redundant efforts. We have tested our hypothesis on financial credit notes with promising results attesting to its applicability in PDF documents having tables of various formats.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121196203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Test Scheduling using Adaptive Hybrid Prediction Technique","authors":"Sarmishta Sarangarajan, B. Sai Shruthi","doi":"10.1109/GHCI50508.2021.9514046","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9514046","url":null,"abstract":"In a typical data center, there is always an ongoing need to isolate faulty components. Diagnostic and Regression tests are generally tending towards automation. However, the diagnostic tools also put the underlying hardware to various levels of stress. It is challenging to select appropriate test tools and schedule them in such a way that they can help uncover maximum defects and ensure minimal disturbance to the live customer setup. In this paper, we propose a technique to automatically schedule tests on a target system with minimal disturbance to the workload using a real-time adaptive hybrid predictive model. The models we use are trained to predict resource utilization in a fast and accurate manner. This solution enhances the critical decision-making ability of an admin by scheduling the regression or diagnostic tests accurately. Schedules are recommended based on actual resource utilization and are spaced out at intervals when low resource utilization is predicted. This ensures minimal downtime for maintenance and helps meet customer SLA. We also propose a technique to automate the selection process of the best fit time-series model based on analysis of data, which in due course would reduce the prediction overhead by half. 
This solution can work alongside any existing management framework or can be designed as a standalone tool.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130008170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Music genre classification using multi-modal deep learning based fusion","authors":"Laisha Wadhwa, Prerana Mukherjee","doi":"10.1109/GHCI50508.2021.9514020","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9514020","url":null,"abstract":"Music genre classification is extensively used in almost all music streaming applications and websites. Most of them use it either to recommend playlists to their customers (such as Spotify, Soundcloud) or simply as a product (e.g. Shazam and MusixMatch). In this paper, we present a novel approach to classify a given song by encoding both textual and music features. The contribution of this work is twofold, i) We propose a multi modal fusion network approach which enables music genre classification utilizing both the textual features (lyrics) and musical features (mel spectrogram) achieving an accuracy of 90.4%. ii) We also propose a multiframe convolutional recurrent neural network (CRNN) based classifier that uses K-nearest neighbor approach over the predictions of every frame to predict the genre of a given song. In multi-modal fusion approach, we utilize co-attention between the textual and musical features for training classification network. The advantage of CRNN based multi frame approach is that it not only enriches the classification process but also enables to generate more training data from a smaller number of music files and thus helps in data augmentation. 
Our models and code are available on https://github.com/laishawadhwa/Multi-modal-music-genre-classification.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134603328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation and Analysis of Supervised Learning methods for Bugs Classification","authors":"Bhagyashree M Katti","doi":"10.1109/GHCI50508.2021.9513994","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9513994","url":null,"abstract":"Classification of bugs or logs is a vital aspect in the development of high-quality products. Early detection of errors, frequent patterns showing anomalies and, timely rectification of errors reduces the risk of developing faulty software. The aim of proposed system is to integrate a machine learning based intelligent layer to the existing Automation framework. The focus of this work is to classify the logs by extracting the underlying error messages in logs and hence ease the work of developers to save time invested in analyzing the logs. The data set includes logs collected from the Automation framework that were reported during automation runs in the last few months. This paper aims at finding the optimal algorithm to classify the bugs. In this experiment, we examine Naïve Bayes, Multilayer perceptron, CNN, and a hybrid model with a combination of CNN + Naïve Bayes Algorithm along with feature extraction techniques. It is observed from the experiment, that the Multilayer perceptron network has the highest accuracy of 91 percent. 
This experiment shows that our proposed system can classify logs effectively into different class of failure-types.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116070301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"YOLO as a Region Proposal Network for Diagnosing Breast Cancer","authors":"Ananya Bal, M. Das, Shashank Mouli Satapathy","doi":"10.1109/GHCI50508.2021.9513988","DOIUrl":"https://doi.org/10.1109/GHCI50508.2021.9513988","url":null,"abstract":"Cytological images of various types are increasingly being classified with the use of neural networks. But deep learning-based image classification systems are heavily reliant on manually sampled RoI (Region of Interest) patches. A lot of time and effort are required to extract RoI patches from whole slide images or larger images that are too complex to be processed by neural networks. A region proposal network (RPN) is an efficient way to automate the extraction of RoIs. In this study, we have proposed the use of the YOLOv3 network as an RPN to suggest RoIs in images from fine needle aspiration cytology of breast tissue. Patches from the suggested RoIs are fed into a Convolutional Neural Network (CNN) for the classification of benign and malignant lesions and ultimately, the diagnosis of Ductal Carcinoma in breast. The YOLO+CNN model yields a highly satisfactory classification accuracy of 95.73%, 100% specificity, 92.4% sensitivity and a precision score of 1.","PeriodicalId":378325,"journal":{"name":"2021 Grace Hopper Celebration India (GHCI)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125071008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}