Qiwei Han, Mengxin Ji, Inigo Martinez de Rituerto de Troya, Manas Gaur, Leid Zejnilovic
{"title":"A Hybrid Recommender System for Patient-Doctor Matchmaking in Primary Care","authors":"Qiwei Han, Mengxin Ji, Inigo Martinez de Rituerto de Troya, Manas Gaur, Leid Zejnilovic","doi":"10.1109/DSAA.2018.00062","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00062","url":null,"abstract":"We partner with a leading European healthcare provider and design a mechanism to match patients with family doctors in primary care. We define the matchmaking process for several distinct use cases given different levels of available information about patients. Then, we adopt a hybrid recommender system to present each patient a list of family doctor recommendations. In particular, we model patient trust of family doctors using a large-scale dataset of consultation histories, while accounting for the temporal dynamics of their relationships. Our proposed approach shows higher predictive accuracy than both a heuristic baseline and a collaborative filtering approach, and the proposed trust measure further improves model performance.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121503373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maarten Bieshaar, Malte Depping, Jan Schneegans, B. Sick
{"title":"Starting Movement Detection of Cyclists Using Smart Devices","authors":"Maarten Bieshaar, Malte Depping, Jan Schneegans, B. Sick","doi":"10.1109/DSAA.2018.00042","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00042","url":null,"abstract":"In the near future, vulnerable road users (VRUs) such as cyclists and pedestrians will be equipped with smart devices and wearables that are capable of communicating with intelligent vehicles and other traffic participants. Road users are then able to cooperate on different levels, such as in cooperative intention detection for advanced VRU protection. Smart devices can be used to detect intentions, e.g., an occluded cyclist intending to cross the road, to warn vehicles of VRUs, and prevent potential collisions. This article presents a human activity recognition approach to detect the starting movement of cyclists wearing smart devices. We propose a novel two-stage feature selection procedure using a score specialized for robust starting detection, reducing false positive detections and leading to understandable and interpretable features. The detection is modelled as a classification problem and realized by means of a machine learning classifier. We introduce an auxiliary class that models starting movements and allows early movement indicators, i.e., body part movements indicating future behaviour, to be integrated. In this way we improve the robustness and reduce the detection time of the classifier. Our empirical studies with real-world data originating from experiments which involve 49 test subjects and consist of 84 starting motions show that we are able to detect the starting movements early. Our approach reaches an F1-score of 67 % within 0.33 s after the first movement of the bicycle wheel. Investigations concerning the device wearing location show that for devices worn in the trouser pocket the detector has fewer false detections and detects starting movements faster on average. 
% compared to reference detector involving all wearing locations. We found that we can further improve the results when we train distinct classifiers for different wearing locations. In this case we reach an F1-score of 94 % with a mean detection time of 0.34 s for the device worn in the trouser pocket.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122005780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Economic Value of Neighborhoods: Predicting Real Estate Prices from the Urban Environment","authors":"Marco De Nadai, B. Lepri","doi":"10.1109/DSAA.2018.00043","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00043","url":null,"abstract":"Housing costs have a significant impact on individuals, families, businesses, and governments. Recently, online companies such as Zillow have developed proprietary systems that provide automated estimates of housing prices without the immediate need of professional appraisers. Yet, our understanding of what drives the value of houses is very limited. In this paper, we use multiple sources of data to disentangle the economic contribution of the neighborhood's characteristics such as walkability and security perception. We also develop and release a framework able to nowcast housing prices from open data, without the need for historical transactions. Experiments involving 70,000 houses in 8 Italian cities highlight that the neighborhood's vitality and walkability seem to drive more than 20% of the housing value. Moreover, the use of this information improves the nowcast by 60%. Hence, the characteristics of a property's surroundings can be an invaluable resource to appraise the economic and social value of houses after neighborhood changes and, potentially, anticipate gentrification.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133507006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. García-Martín, Niklas Lavesson, Håkan Grahn, E. Casalicchio, V. Boeva
{"title":"Hoeffding Trees with Nmin Adaptation","authors":"E. García-Martín, Niklas Lavesson, Håkan Grahn, E. Casalicchio, V. Boeva","doi":"10.1109/DSAA.2018.00017","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00017","url":null,"abstract":"Machine learning software accounts for a significant amount of energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case for data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beginning of the execution. We have observed that having fixed parameters leads to unnecessary computations, thus making the algorithm energy inefficient. In this paper we present the nmin adaptation method for Hoeffding trees. This method adapts the value of the nmin parameter, which significantly affects the energy consumption of the algorithm. The method reduces unnecessary computations and memory accesses, thus reducing the energy, while the accuracy is only marginally affected. We experimentally compared VFDT (Very Fast Decision Tree, the first Hoeffding tree algorithm) and CVFDT (Concept-adapting VFDT) with VFDT-nmin (VFDT with nmin adaptation). 
The results show that VFDT-nmin consumes up to 27% less energy than the standard VFDT, and up to 92% less energy than CVFDT, trading off a few percent of accuracy in a few datasets.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125521537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gradient Reversal against Discrimination: A Fair Neural Network Learning Approach","authors":"Edward Raff, Jared Sylvester","doi":"10.1109/DSAA.2018.00029","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00029","url":null,"abstract":"No methods currently exist for inducing fairness in arbitrary neural network architectures. In this work we introduce GRAD, a new and simplified method for producing fair neural networks that can be used for auto-encoding fair representations or directly with predictive networks. It is easy to implement and add to existing architectures, has only one (insensitive) hyper-parameter, and provides improved individual and group fairness. We use the flexibility of GRAD to demonstrate multi-attribute protection.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123087248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Duong Nguyen, Rodolphe Vadaine, G. Hajduch, R. Garello, Ronan Fablet
{"title":"A Multi-Task Deep Learning Architecture for Maritime Surveillance Using AIS Data Streams","authors":"Duong Nguyen, Rodolphe Vadaine, G. Hajduch, R. Garello, Ronan Fablet","doi":"10.1109/DSAA.2018.00044","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00044","url":null,"abstract":"In a world of global trading, maritime safety, security and efficiency are crucial issues. We propose a multi-task deep learning framework for vessel monitoring using Automatic Identification System (AIS) data streams. We combine recurrent neural networks with latent variable modeling and an embedding of AIS messages to a new representation space to jointly address key issues to be dealt with when considering AIS data streams: massive amount of streaming data, noisy data and irregular time-sampling. We demonstrate the relevance of the proposed deep learning framework on real AIS datasets for a three-task setting, namely trajectory reconstruction, anomaly detection and vessel type identification.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125839822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael A. Specter, Lalana Kagal
{"title":"Explaining Explanations: An Overview of Interpretability of Machine Learning","authors":"Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael A. Specter, Lalana Kagal","doi":"10.1109/DSAA.2018.00018","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00018","url":null,"abstract":"There has recently been a surge of work in explanatory artificial intelligence (XAI). This research area tackles the important problem that complex machines and algorithms often cannot provide insights into their behavior and thought processes. XAI allows users and parts of the internal system to be more transparent, providing explanations of their decisions in some level of detail. These explanations are important to ensure algorithmic fairness, identify potential bias/problems in the training data, and to ensure that the algorithms perform as expected. However, explanations produced by these systems are neither standardized nor systematically assessed. In an effort to create best practices and identify open challenges, we describe foundational concepts of explainability and show how they can be used to classify existing literature. We discuss why current approaches to explanatory methods, especially for deep neural networks, are insufficient. 
Finally, based on our survey, we conclude with suggested future research directions for explanatory artificial intelligence.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127841297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Retweet Count during Elections Using Graph Convolution Neural Networks","authors":"Raghavendran Vijayan, G. Mohler","doi":"10.1109/DSAA.2018.00036","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00036","url":null,"abstract":"A retweet refers to sharing a tweet posted by another user on Twitter and is the primary way information spreads on the Twitter network. Political parties use Twitter extensively as a part of their campaigns to promote their presence, announce their propaganda, and at times debate with opponents. In this work we consider the problem of early prediction of the final retweet count using information from the network during the first several minutes after a post is made. Such predictions are useful for ranking and promoting posts and can also be used in combination with fake news detection. From a machine learning perspective, the task can be viewed as a regression problem. We introduce a novel graph convolution neural network for forecasting retweet count that combines network level features through graph convolution layers as well as tweet level features at a higher dense layer in the network. We first provide an overview of the graph convolution network architecture and then perform several experiments on Twitter data collected during presidential elections in South Africa (2014) and Kenya (2013). 
We show that the model outperforms baseline models including a feed forward neural network and the popular point process based model SEISMIC.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131188995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Sabath, Q. Di, D. Braun, J. Schwartz, F. Dominici, C. Choirat
{"title":"Airpred: A Flexible R Package Implementing Methods for Predicting Air Pollution","authors":"M. Sabath, Q. Di, D. Braun, J. Schwartz, F. Dominici, C. Choirat","doi":"10.1109/DSAA.2018.00074","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00074","url":null,"abstract":"Large epidemiological studies have shown that exposure to air pollution, in particular fine particulate matter (PM2.5), is harmful to human health. However, air pollution monitors which measure air pollutant concentrations are sparsely located, excluding large portions of the population, in particular non-urban populations, from studies. One approach to resolving this issue has been developing models to predict local PM2.5, NO2, and ozone in unmonitored areas based on satellite, meteorological, and land use data. These prediction models are typically developed using large amounts of input data and are highly computationally intensive. We have developed a flexible R package that allows environmental health researchers to design and train spatio-temporal models capable of predicting multiple pollutants, including PM2.5. We utilize H2O, an open source big data R platform, to achieve both performance and scalability when used in conjunction with cloud or cluster computing systems.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126840064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to Make Predictions on Graphs with Autoencoders","authors":"Phi Vu Tran","doi":"10.1109/DSAA.2018.00034","DOIUrl":"https://doi.org/10.1109/DSAA.2018.00034","url":null,"abstract":"We examine two fundamental tasks associated with graph representation learning: link prediction and semi-supervised node classification. We present a novel autoencoder architecture capable of learning a joint representation of both local graph structure and available node features for the multi-task learning of link prediction and node classification. Our autoencoder architecture is efficiently trained end-to-end in a single learning stage to simultaneously perform link prediction and node classification, whereas previous related methods require multiple training steps that are difficult to optimize. We provide a comprehensive empirical evaluation of our models on nine benchmark graph-structured datasets and demonstrate significant improvement over related methods for graph representation learning. Reference code and data are available at https://github.com/vuptran/graph-representation-learning.","PeriodicalId":208455,"journal":{"name":"2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123284478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}