{"title":"Development of Mental Model in Understanding Users' Thought Processes for the Evaluation and Functional Enhancement of Clinical Decision Support Systems","authors":"I. Agboola","doi":"10.1109/ICDMW.2018.00214","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00214","url":null,"abstract":"Clinical Decision Support Systems (CDSSs) are developed to help experts in making decisions; however, these systems tend to go out of date quickly due to changing needs of the users or simply because these systems are not user centred. The users' mental model, which reflects their view of the system, is considered to be a major starting point for system evaluation and upgrade. The users' thought processes when interacting with the system are key to understanding the processes or navigational paths users follow in achieving the goal of using the system. This research examines the elicitation of users' mental models using the Galatea Risk and Safety Tool (GRiST), a web-based mental health decision support system. This is to gain a better understanding of how the system is used in the workplace and to identify usability issues resulting from any mismatch or gap between the conceptual model of the system and the mental model of the users which may infer decision making of the system. A novel framework that aims to mine users' perspectives of the system is proposed. The framework further explores the data obtained to help in the system evaluation and re-design. 
The methodology involves monitoring users' interactions, mining the data, and constructing mental models using the repertory grid technique and concept mapping to capture users' understanding of the system, and analysing the findings to investigate and resolve usability and system functionality issues.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115403413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Forecasting Framework for Bitcoin Price with LSTM","authors":"Chih-Hung Wu, Chih-Chaing Lu, Yu-Feng Ma, Ruei-Shan Lu","doi":"10.1109/ICDMW.2018.00032","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00032","url":null,"abstract":"Long short-term memory (LSTM) networks are a state-of-the-art sequence learning method in deep learning for time series forecasting. However, few studies have applied them to financial time series forecasting, especially cryptocurrency prediction. Therefore, we propose a new forecasting framework to forecast the daily bitcoin price with two LSTM models (a conventional LSTM model and an LSTM with AR(2) model). The performance of the proposed models is evaluated using daily bitcoin price data from 2018/1/1 to 2018/7/28, 208 records in total. The results confirmed the excellent forecasting accuracy of the proposed model with AR(2). The models are evaluated on test mean squared error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE) for bitcoin price prediction. The proposed LSTM with AR(2) model outperformed the conventional LSTM model. The contribution of this study is a new forecasting framework for bitcoin price prediction that can overcome the problem of input variable selection in LSTM without strict assumptions about the data. 
The results suggest its possible applicability to the prediction of other cryptocurrencies and to industry cases such as medical data or financial time-series data.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114931871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding Multi-granularity Community Structures in Social Networks Based on Significance of Community Partition","authors":"Chenxu Gong, Guoyin Wang, Jun Hu, Ming Liu, Li Liu, Zihe Yang","doi":"10.1109/ICDMW.2018.00068","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00068","url":null,"abstract":"Community structure detection is an important and valuable task in social network studies as it is the base for many social network applications such as link prediction, recommendation, etc. Most social networks have an inherent multi-granular structure, which leads to different community structures at different granularities. However, few studies pay attention to such multi-granular characteristics of social networks. In this paper, a method called MGCD (Multi-Granularity Community Detection) is proposed for finding multi-granularity community structures of social networks. At first, a network embedding method is used to obtain the low-dimensional vector representation for each node. Then, an effective embedding-based strategy for weakening the detected community structures is proposed. Finally, a joint learning framework, which combines network embedding and community structure weakening is developed for identifying the multi-granularity community structures of social networks. 
Experimental results on real-world networks show that MGCD outperforms the state-of-the-art benchmark methods on finding multi-granularity community structure tasks.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116577716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Student at Risk of Failure: A Case Study of Conceptualizing Mining from Internet Access Log Files","authors":"Ruangsak Trakunphutthirak, Y. Cheung, V. Lee","doi":"10.1109/ICDMW.2018.00060","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00060","url":null,"abstract":"Predicting student academic performance can be done by using educational data mining. Machine learning techniques play an important role in predicting academic performance from large-scale data such as internet access log files from a university. Current data sources are mainly manual collections of data or data from a single unit of study. This study highlights the use of a new data source by transforming a university log file to predict academic performance. The log file comprises student internet access activities and browsing categories. To detect overall student academic performance, we select the best prediction accuracy by enhancing two datasets and comparing different weights in the time and frequency domains. We found that the random forest technique performs best on these datasets for predicting students at risk-of-failure. We also found that data from internet access activities yields better accuracy than data from browsing categories. 
The combination of two datasets reveals a better picture of students' internet utilization and thus indicates how students at risk-of-failure can be detected by their internet access activities and browsing behavior.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123927175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting and Characterizing Trends in Online Mental Health Discussions","authors":"Dante Chakravorti, Kathleen Law, Jonathan F. Gemmell, D. Raicu","doi":"10.1109/ICDMW.2018.00107","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00107","url":null,"abstract":"Mental illness is a widespread public health concern that affects many individuals on a daily basis. Increasingly, people are turning to social media to discuss their mental health. The result is a rich dataset of authentic discussions from which to draw insights. In this work, we collected data from multiple mental health forums on the popular social media website, Reddit. We extracted topics from these datasets and then observed the trends of these topics from 2012 to 2018. These trends fall into many recognizable patterns. Some trends are stable, often using common words found in mental health conversations. Other trends are increasing or decreasing. In this work, we found that topics with positive words are becoming less frequently used and topics with negative connotations are becoming more frequently used. Other trends display a periodic pattern, like those associated with the school year. One trend demonstrates a sudden shift in conversation possibly due to a change in how the site is administered. 
We confirm these qualitative observations through quantitative analysis with statistical tests.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114316117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extracting Addresses from Unstructured Text Using Bi-directional Recurrent Neural Networks","authors":"Shivin Srivastava","doi":"10.1109/ICDMW.2018.00223","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00223","url":null,"abstract":"Addresses can be classified as unstructured text because they lack meta-information to be directly indexed in databases. Still, they demonstrate an internal structure which can be used to automatically extract them using machine learning techniques. In this work we describe a machine learning approach to identify addresses in unstructured text (like blogs) using Bidirectional Recurrent Neural Networks (BRNNs). We overcome the problem of lack of training data by generating synthetic free text entries and come up with problem-specific features. Our system does not impose any strict condition on the structure or style of addresses, leading to many applications in real life.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121626779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Overlapping Toxic Sentiment Classification Using Deep Neural Architectures","authors":"Hafiz Hassaan Saeed, K. Shahzad, F. Kamiran","doi":"10.1109/ICDMW.2018.00193","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00193","url":null,"abstract":"We are living in an era where data is enjoying an unprecedented increase in its volume with each passing moment through online media platforms. Such a colossal amount of data is multifarious in its nature, and textual data proves to be its vital pillar. Almost every sort of online media platform is producing textual data. Short posts (e.g. on Twitter and Facebook) and comments constitute a significant part of this textual data. Unfortunately, this text data may contain overlapping toxic sentiments in terms of personal attacks, abuses, obscenity, insults, threats or identity hatred. In many cases, it becomes extremely important to track such toxic posts/data to trigger needed actions, e.g. automated tagging of posts as inappropriate. State-of-the-art classification techniques do not handle the overlapping sentiment categories of text data. In this paper, we propose Deep Neural Network (DNN) architectures to classify the overlapping sentiments with high accuracy. Moreover, we show that our proposed classification framework does not require any laborious text pre-processing and is capable of handling text pre-processing (e.g. stop word removal, feature engineering, etc.) intrinsically. 
Our empirical validation on a real world dataset supports our claims by showing the superior performance of the proposed methods.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132407989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning the Group Structure of Deep Neural Networks with an Expectation Maximization Method","authors":"Subin Yi, Jaesik Choi","doi":"10.1109/ICDMW.2018.00106","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00106","url":null,"abstract":"Many recent deep learning studies use very deep neural networks with a huge number of parameters. This yields strong expressive power; however, it also brings issues such as overfitting to training data, increased memory burden and excessive computation. In this paper, we propose an expectation maximization method to learn the group structure of deep neural networks with a group regularization principle to resolve those issues. Our method clusters the neurons in a layer based on how they are connected to the neurons in the next layer using a mixture model and the neurons in the next layer based on which group in the current layer they are most strongly connected to. Our expectation maximization method uses the Gaussian mixture model to keep the most salient connections and remove others to acquire a grouped weight matrix in a block diagonal matrix form. We refine our method further to cluster the kernels of convolutional neural networks (CNNs). We define the representative value of each kernel and build a representative matrix. The matrix is then grouped and the kernels are pruned out based on the group structure of the representative matrix. In experiments, we applied our method to fully-connected networks, 1-dimensional CNNs, and 2-dimensional CNNs and compared it with baseline deep neural networks on the MNIST, CIFAR-10, and United States groundwater datasets with respect to the number of parameters and classification and regression accuracy. 
We show that our method can reduce the number of parameters significantly without loss of accuracy and outperform the baseline models.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133982166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IDDAT: An Ontology-Driven Decision Support System for Infectious Disease Diagnosis and Therapy","authors":"Ying Shen, Yang Deng, Jin Zhang, Yaliang Li, Nan Du, Wei Fan, Min Yang, Kai Lei","doi":"10.1109/ICDMW.2018.00201","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00201","url":null,"abstract":"Decision Support Systems (DSSs) have become increasingly important due to their broad applications in various domains. Significant progress has been made on ensuring more precise decision-making by leveraging appropriate data and knowledge from knowledge bases. However, the current DSSs related to antibiotics consider only therapy rather than diagnosis, and they were developed from a physician's perspective. Based on these two points, this study presents IDDAT, an ontology-driven decision support system for aiding Infectious Disease Diagnosis and Antibiotic Therapy. Based on patient-entered information, this freely accessible system aims to identify infectious disease and provide an antibiotic therapy specifically adapted to the patient. We show the effectiveness of IDDAT by applying it to a diagnosis classification task. Experimental results reveal the system's advantages in terms of the area under the curve (AUC) of receiver operating characteristic (ROC) (89.91%).","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130860526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Vehicle Dispatch: from Assignment to Scheduling","authors":"Kangjia Zhao, Wenqing Chen, Kong-wei Lye","doi":"10.1109/ICDMW.2018.00094","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00094","url":null,"abstract":"The availability of prior demand data makes it possible for a ride-hailing platform to devise central control strategies that plan a sequence of trips for drivers over a certain future time period, so that a system-optimal vehicle dispatch can be achieved. However, handling large-scale booking requests within a restrictive computing time to achieve an optimal vehicle dispatch is a big challenge. This paper proposes an optimization framework for the online vehicle dispatch problem by adopting vehicle scheduling methodology. A novel approach is introduced to solve the optimization challenges of the large problem size and the limited computing time. The designed optimization framework is validated on real-world demand data from Singapore.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131181979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}