{"title":"Feature-Based Understanding of Human Emotions","authors":"Jonathon Moody, D. Jeong, Soo-Yeon Ji","doi":"10.1109/IRI.2019.00051","DOIUrl":"https://doi.org/10.1109/IRI.2019.00051","url":null,"abstract":"Since human emotion recognition is considered as one of the priority research topics in academia and industries to help people manage their stress and emotions, many significant research studies have been performed by proposing innovative techniques to recognize emotions. However, it is still difficult to understand the emotions. In this paper, we focused on analyzing the emotions computationally. In detail, a wavelet transform technique is utilized to extract significant features to find patterns in an emotion dataset. With the features, both classification and visual analysis are performed. For the classification, Logistic Regression, C4.5, and Support Vector Machine are used. Visualization techniques are utilized to show the similarity and difference among the emotion patterns. From the analysis, we found that there is an improvement in identifying the difference among the emotions.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125908951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Renovating Database Queries with Query AutoAwesome","authors":"Chetna Suryavanshi, C. Dyreson, Jonathan Adams","doi":"10.1109/IRI.2019.00048","DOIUrl":"https://doi.org/10.1109/IRI.2019.00048","url":null,"abstract":"Querying is an important database activity, typically occurring more frequently than update. Programmers invest time and effort into developing a set of queries. We propose reusing these queries with Query AutoAwesome (QAA). QAA is a system to automatically enhance queries. QAA was inspired by Google's Auto Awesome tool, which provides automatic enhancements of photos. QAA ingests a query, the database schema, and the data to enhance a query. QAA can reuse a query in several ways, such as replacing a literal with a choice of alternatives. From among the pool of potential enhancements it is important to choose the top-k generated queries. To do so we introduce objective functions to measure the enhancement and rank potential queries.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132919261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Multi-Task Learning for Interpretable Glaucoma Detection","authors":"Nooshin Mojab, V. Noroozi, Philip S. Yu, J. Hallak","doi":"10.1109/IRI.2019.00037","DOIUrl":"https://doi.org/10.1109/IRI.2019.00037","url":null,"abstract":"Glaucoma is one of the leading causes of blindness worldwide. The rising prevalence of glaucoma, with our aging population, increases the need to develop automated systems that can aid physicians in early detection, ultimately preventing vision loss. Clinical interpretability and adequately labeled data present two major challenges for existing deep learning algorithms for automated glaucoma screening. We propose an interpretable multi-task model for glaucoma detection, called Interpretable Glaucoma Detector (InterGD). InterGD is composed of two major complementary components, segmentation and prediction modules. The segmentation module addresses the lack of clinical interpretability by locating the optic disc and optic cup regions in a fundus image. The prediction module utilizes a larger dataset to improve the performance of the segmentation task and thus mitigate the problem of limited labeled data in a segmentation module. The two components are effectively integrated into a unified multi-task framework allowing end-to-end training. To the best of our knowledge, this work is the first to incorporate interpretability into glaucoma screening employing deep learning methods. The experiments on three datasets, two public and one private, demonstrate the effectiveness of InterGD in achieving interpretable results for glaucoma screening.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115930607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IRI 2019 Foreword","authors":"","doi":"10.1109/iri.2019.00005","DOIUrl":"https://doi.org/10.1109/iri.2019.00005","url":null,"abstract":"","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125442193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Temporal Relationships between ASD and Brain Activity through EEG and Machine Learning","authors":"Yasith Jayawardana, M. Jaime, S. Jayarathna","doi":"10.1109/IRI.2019.00035","DOIUrl":"https://doi.org/10.1109/IRI.2019.00035","url":null,"abstract":"Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that impairs normative social cognitive and communicative function. Early diagnosis is crucial for the timely and efficacious treatment of ASD. The Autism Diagnostic Observation Schedule Second Edition (ADOS-2) is the current gold standard for diagnosing ASD. In this paper, we analyse the short-term and long-term relationships between ASD and brain activity using Electroencephalography (EEG) readings taken during the administration of ADOS-2. These readings were collected from 8 children diagnosed with ASD, and 9 low risk controls. We derive power spectrums for each electrode through frequency band decomposition and through wavelet transforms relative to a baseline, and generate two sets of training data that capture long-term and short-term trends respectively. We utilize machine learning models to predict the ASD diagnosis and the ADOS-2 scores, which provide an estimate for the presence of such trends. When evaluating short-term dependencies, we obtain a maximum of 56% accuracy of classification through linear models. Non-linear models provide a classification above 92% accuracy, and predicted ADOS-2 scores within an RMSE of 4. We use a CNN model to evaluate the long-term trends, and obtain a classification accuracy above 90%. Our findings have implications for using EEG as a non-invasive bio-marker for ASD with minimal feature manipulation and computational overhead.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115531495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning and Data Sampling with Imbalanced Big Data","authors":"Justin M. Johnson, T. Khoshgoftaar","doi":"10.1109/IRI.2019.00038","DOIUrl":"https://doi.org/10.1109/IRI.2019.00038","url":null,"abstract":"This study evaluates the use of deep learning and data sampling on a class-imbalanced Big Data problem, i.e. Medicare fraud detection. Medicare offers affordable health insurance to the elderly population and serves more than 15% of the United States population. To increase transparency and help reduce fraud, the Centers for Medicare and Medicaid Services (CMS) have made several data sets publicly available for analysis. Our research group has conducted several studies using CMS data and traditional machine learning algorithms (non-deep learning), but challenges associated with severe class imbalance leave room for improvement. These previous studies serve as baselines as we employ deep neural networks with various data-sampling techniques to determine the efficacy of deep learning in addressing class imbalance. Random over-sampling (ROS), random under-sampling (RUS), and combinations of the two (ROS-RUS) are applied to study how varying levels of class imbalance impact model training and performance. Classwise performance is maximized by identifying optimal decision thresholds, and a strong linear relationship between minority class size and optimal threshold is observed. Results show that ROS significantly outperforms RUS, that combining RUS and ROS maximizes both performance and efficiency with a 4x speedup in training time, and that the default threshold of 0.5 is never optimal when training data is imbalanced. To the best of our knowledge, this is the first study to provide statistical results comparing ROS, RUS, and ROS-RUS deep learning methods across a range of class distributions. Additional contributions include a unique analysis of thresholding as it relates to the minority class size and state-of-the-art performance on the given fraud detection task.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122047907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Ensemble Machine Learning Method for Single and Clustered Cervical Cell Classification","authors":"Mohammed Kuko, M. Pourhomayoun","doi":"10.1109/IRI.2019.00043","DOIUrl":"https://doi.org/10.1109/IRI.2019.00043","url":null,"abstract":"Cervical Cancer was in recent history a major cause of death for women of childbearing age. This changed when in the 1950s the Papanicolaou (Pap smear) test was introduced to identify and diagnose cervical cancer in its infancy. The introduction of the Pap smear test dropped cervical cancer related deaths by 60% but still approximately 4,210 women die from cervical cancer in the United States annually. The goal of our research is to aid in the methods of identifying and classifying cervical cancer used in the Pap smear or Liquid-based Cytology (LBC) with cutting edge machine vision, and ensemble learning techniques. The contribution of this research is to develop an automated Pap smear screening system that identifies cells within a cervical cell slide sample and classifies cells and clusters of cells as abnormal or normal as defined by the Bethesda System for reporting cervical cytology. Achieving an accuracy of 90.4% when evaluated with a five-fold cross-validation demonstrates promise in the creation of an automated Pap smear screening test.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133075724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IRI 2019 Keynotes","authors":"","doi":"10.1109/iri.2019.00012","DOIUrl":"https://doi.org/10.1109/iri.2019.00012","url":null,"abstract":"","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"229 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132714236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preliminary Study of Relationship between Health Behavior and Breast Cancer","authors":"Kuo-Chung Chu, Min-Yang Xiao, C. Chang, Cheng-Hsiang Hsiao, Yan-Chen Jiang, Po-Yao Tsai","doi":"10.1109/IRI.2019.00069","DOIUrl":"https://doi.org/10.1109/IRI.2019.00069","url":null,"abstract":"In Taiwan, breast cancer is the leading cancer in women; every year, more than 10,000 women are newly diagnosed with breast cancer, and more than 2,000 women die of it. Some cancer research centers and academic research have previously indicated that breast cancer risk factors include tobacco exposure, alcoholic beverage consumption habits, soft drink beverage consumption habits, radiation exposure, work environment, dietary habits, aging, family genetics, obesity, the use of Diethylstilbestrol (DES), etc. This research integrates the open data of various fields to analyze and discuss breast cancer risk factors, and the results can be used as a reference for future clinical research, as well as provided to relevant government departments and medical institutions for breast cancer prevention and advocacy.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"34 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132762538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Neural Network Construction with Similarity Sensitive Evolutionary Algorithms","authors":"Haiman Tian, Shu‐Ching Chen, M. Shyu, S. Rubin","doi":"10.1109/IRI.2019.00052","DOIUrl":"https://doi.org/10.1109/IRI.2019.00052","url":null,"abstract":"Deep learning has been successfully applied to a wide variety of tasks. It generates reusable knowledge that allows transfer learning to significantly impact more scientific research areas. However, there is no automatic way to build a new model that guarantees an adequate performance. In this paper, we propose an automated neural network construction framework to overcome the limitations found in current approaches using transfer learning. Currently, researchers spend much time and effort to understand the characteristics of the data when designing a new network model. Therefore, the proposed method leverages the strength in evolutionary algorithms to automate the search and optimization process. Similarities between the individuals are also considered during the cycled evolutionary process to avoid getting stuck in a local optimum. Overall, the experimental results effectively reach optimal solutions proving that a time-consuming task could also be done by an automated process that exceeds the human ability to select the best hyperparameters.","PeriodicalId":295028,"journal":{"name":"2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116236932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}