{"title":"A Comparative Study of Urban Mobility Patterns Using Large-Scale Spatio-Temporal Data","authors":"The Anh Dang, Jodi Chiam, Y. Li","doi":"10.1109/ICDMW.2018.00089","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00089","url":null,"abstract":"The large scale spatio-temporal data brought about by the ubiquitous wireless networks, mobile phones, and GPS devices present a fertile ground for studying human mobility. These data sources come with high coverage and resolution that enable studies of mobility patterns for human populations at large that other conventional methods such as surveys are not capable of. In this paper, we study anonymized spatio-temporal data from telco networks to understand the variability in human mobility behavior across different geographical regions. We present methodologies for extracting trips and other mobility features from large-scale spatio-temporal data. We also look into daily activity patterns of the populations in two specific cities, Singapore and Sydney. Our results include measures of distance and frequency of people's travel, as well as their purpose of travel, mode of transport, and route choice. We extract mobility patterns known as motifs. We also define a mobility index to assess the mobility level of individuals and compare it among different regions and demographic groups. 
This work contributes to a more comprehensive understanding of urban dynamics, supporting smart city development and sustainable urbanization.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132729233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voxelwise 3D Convolutional and Recurrent Neural Networks for Epilepsy and Depression Diagnostics from Structural and Functional MRI Data","authors":"Marina Pominova, Alexey Artemov, M. Sharaev, E. Kondrateva, A. Bernstein, Evgeny Burnaev","doi":"10.1109/ICDMW.2018.00050","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00050","url":null,"abstract":"In the field of psychoneurology, analysis of neuroimaging data aimed at extracting distinctive patterns of pathologies, such as epilepsy and depression, is well known to represent a challenging problem. As the resolution and acquisition rates of modern medical scanners rise, the need to automatically capture complex spatiotemporal patterns in large imaging arrays suggests using automated approaches to pattern recognition in volumetric images, such as training classification models using deep learning. On the other hand, with typically scarce training data, the choice of a particular neural network architecture remains an unresolved issue. In this work, we evaluate off-the-shelf building blocks of deep voxelwise neural architectures with the goal of learning robust decision rules in computational psychiatry. To this end, we carry out a series of computational experiments, aiming at the recognition of epilepsy and depression on structural (3D) and functional (4D) MRI data. 
We find that our investigated models perform on par with computational approaches known in the literature, without the need for sophisticated preprocessing and feature extraction.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128335085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph Regularized Symmetric Non-Negative Matrix Factorization for Graph Clustering","authors":"Ziheng Gao, Naiyang Guan, Longfei Su","doi":"10.1109/ICDMW.2018.00062","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00062","url":null,"abstract":"Symmetric non-negative matrix factorization (Sym-NMF) decomposes a high-dimensional symmetric non-negative matrix into a low-dimensional non-negative matrix and has been successfully used in graph clustering. In this paper, we propose a graph regularized symmetric non-negative matrix factorization (GrSymNMF) to enhance its performance in graph clustering. In particular, GrSymNMF encodes the geometric structure so that nearby points remain close to each other in the clustering domain. We optimize GrSymNMF by using a greedy coordinate descent algorithm and provide a distributed computing strategy to deploy GrSymNMF to large-scale datasets because it requires little communication overhead among computing nodes. The experiments on complex graph datasets and text corpus datasets verify the performance of GrSymNMF and the efficiency, scalability and effectiveness of its distributed strategy.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131648946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multivariate Arrival Times with Recurrent Neural Networks for Personalized Demand Forecasting","authors":"Tianle Chen, Brian Keng, Javier Moreno","doi":"10.1109/ICDMW.2018.00121","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00121","url":null,"abstract":"Access to a large variety of data across a massive population has made it possible to predict customer purchase patterns and responses to marketing campaigns. In particular, accurate demand forecasts for popular products with frequent repeat purchases are essential since these products are one of the main drivers of profits. However, buyer purchase patterns are extremely diverse and sparse on a per-product level due to population heterogeneity as well as dependence in purchase patterns across product categories. Traditional methods in survival analysis have proven effective in dealing with censored data by assuming parametric distributions on inter-arrival times. Distributional parameters are then fitted, typically in a regression framework. On the other hand, neural-network based models take a non-parametric approach to learn relations from a larger functional class. However, the lack of distributional assumptions makes it difficult to model partially observed data. In this paper, we directly model the inter-arrival times as well as the partially observed information at each time step in a survival-based approach using Recurrent Neural Networks (RNN) to model purchase times jointly over several products. Instead of predicting a point estimate for inter-arrival times, the RNN outputs parameters that define a distributional estimate. The loss function is the negative log-likelihood of these parameters given partially observed data. This approach allows one to leverage both fully observed data as well as partial information. By externalizing the censoring problem through a log-likelihood loss function, we show that substantial improvements over state-of-the-art machine learning methods can be achieved. 
We present experimental results based on two open datasets as well as a study on a real dataset from a large retailer.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132132239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inter-Sentence Relation Extraction for Associating Biological Context with Events in Biomedical Texts","authors":"Enrique Noriega-Atala, P. Hein, S. S. Thumsi, Zechy Wong, Xia Wang, C. Morrison","doi":"10.1109/ICDMW.2018.00110","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00110","url":null,"abstract":"We present an analysis of the problem of identifying biological context and associating it with biochemical events in biomedical texts. This constitutes a non-trivial, inter-sentential relation extraction task. We focus on biological context as descriptions of the species, tissue type and cell type that are associated with biochemical events. We describe the properties of an annotated corpus of context-event relations and present and evaluate several classifiers for context-event association trained on syntactic, distance and frequency features.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"05 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134526662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Outlier Detection Based on Low Density Models","authors":"Félix Iglesias, T. Zseby, A. Zimek","doi":"10.1109/ICDMW.2018.00140","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00140","url":null,"abstract":"Most outlier detection algorithms are based on lazy learning or imply quadratic complexity. Both characteristics make them unsuitable for big data and stream data applications and preclude their applicability in systems that must operate autonomously. In this paper we propose a new algorithm—called SDO (Sparse Data Observers)—to estimate outlierness based on low density models of data. SDO is an eager learner; therefore, computational costs in application phases are severely reduced. We perform tests with a wide variation of synthetic datasets as well as the main datasets published in the literature for anomaly detection testing. Results show that SDO satisfactorily competes with the best ranked outlier detection alternatives. The good detection performance coupled with a low complexity makes SDO highly flexible and adaptable to stand-alone frameworks that must detect outliers fast with accuracy rates equivalent to lazy learning algorithms.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134570061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Decentralized Approach for Negative Link Prediction in Large Graphs","authors":"F. Abbasi, M. Muzammal, Qiang Qu","doi":"10.1109/ICDMW.2018.00027","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00027","url":null,"abstract":"Social network analytics is an important research area and attracts a lot of attention from researchers. Extraction of meaningful information from linked structures such as graphs is known as link analysis. The emergence of signed social networks gives interesting insights into social networks, as signed networks can represent various real-world relationships with positive (friend) and negative (foe) links. One interesting issue in signed networks is edge sign prediction among the members of the network. Negative link prediction is challenging due to the limited availability of training data and the difficulty of extracting a graph embedding that represents the negative links in a sparse graph. This study focuses on the prediction of negative links across the signed network using a decentralized approach. For learning latent factors across the network, we use probabilistic matrix factorization. A detailed experimental study is performed to evaluate the accuracy of the proposed model. 
The results show that negative link prediction using matrix factorization is a promising approach and negative links can be predicted with high accuracy.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123016275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stitching Aerial Images for Vehicle Positioning and Tracking","authors":"Paul Tsao, Tsì-Uí İk, Guan-Wen Chen, Wen-Chih Peng","doi":"10.1109/ICDMW.2018.00096","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00096","url":null,"abstract":"In recent years, applications such as the reproduction of traffic accident scenes and the calculation of traffic flow have used UAVs to collect data through aerial images. If a UAV does not move, the coordinates of objects in images can be used as a reference for predicting parameters such as distance and direction. However, this does not apply if the UAV is moving. In this research, image processing methods are used to obtain a pinhole model similar to perpendicularly taken images, which should be distance-preserving. Then, image stitching is used to find the relative position between images in order to construct a global coordinate system over all images. Commonly used image stitching methods do not consider distance preservation and usually cause cumulative error. In this work, GPS information is used to estimate the starting position of the images, which are then stitched into a panorama by applying SIFT feature pairs and fine-tuning with the gradient method. Finally, this paper integrates the methods mentioned above into an implementation and presents visualized data, so that both the trajectory of the UAV and the relative position of objects between the images and the panorama can be observed by the user in order to verify the feasibility and precision of the methods. 
The results of this paper can also be applied to traffic flow detection in the future.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123987342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Non-invasive Ventilation in ALS Patients Using Stratified Disease Progression Groups","authors":"S. Pires, M. Gromicho, S. Pinto, M. Carvalho, S. Madeira","doi":"10.1109/ICDMW.2018.00113","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00113","url":null,"abstract":"Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease well known for its rapid progression, leading to death usually within a few years. Respiratory failure is the most common cause of death. Therefore, efforts must be made to prevent respiratory insufficiency. Preventive administration of non-invasive ventilation (NIV) has been shown to improve survival in ALS patients. Using disease progression groups has proven to be of great importance to ALS studies, since the heterogeneous nature of disease presentation and progression presents challenges to the learning of predictive models that work for all patients. In this context, we propose an approach to stratify patients into three progression groups (Slow, Neutral and Fast), enabling the creation of specialized learning models that predict the need for NIV within a time window of 90, 180 or 365 days of their current medical appointment. The models are built using a collection of classifiers and 5x10-fold cross validation. We also use a Feature Selection Ensemble to determine which features are more relevant for predicting this outcome. 
Our specialized predictive models showed promising results, proving the utility of patient stratification when predicting NIV in ALS patients.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127850904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Residual Convolution Neural Network for Sea Ice Classification with Sentinel-1 SAR Imagery","authors":"Wei Song, Minghui Li, Qi He, Dongmei Huang, C. Perra, A. Liotta","doi":"10.1109/ICDMW.2018.00119","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00119","url":null,"abstract":"Sea ice type classification is critically important for sea ice monitoring, and synthetic aperture radar (SAR) has become the main data source for sea ice classification. With a large number of SAR images produced every day, a more intelligent sea ice classification process is urgently needed. In this paper, we constructed a four-type sea ice classification dataset using Sentinel-1 SAR images with the reference of Canadian Ice Service’s ice charts and designed a residual convolution network for sea ice classification: Sea Ice Residual Convolutional Network (SI-Resnet). We further designed a multi-model average scoring strategy with the idea of ensemble learning to improve the classification accuracy between closely-associated ice types. Based on the experiments, our proposed method outperformed MLP, AlexNet, and traditional SVM methods, reaching the overall accuracy of 94% and Kappa coefficient of 91.9. For the evaluation on regional ice concentration, the values computed from the SI-Resnet’s classification results are more consistent with ice chart’s regional concentration data than those of MLP, AlexNet and SVM. 
Compared with the manually generated ice chart of CIS, our method can work automatically and provide a more detailed ice distribution as a useful reference for ship route planning and sea ice change monitoring.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128488777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}