2018 IEEE International Conference on Data Mining Workshops (ICDMW): Latest Papers, Page 3

Effects of Negative Customer Reviews on Sales: Evidence Based on Text Data Mining

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00124

Z. Li, Fangzhou Li, Jing Xiao, Zhi Yang

Citations: 4

Robust Commuter Movement Inference from Connected Mobile Devices

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00099

Baoyang Song, Hasan A. Poonawala, L. Wynter, Sebastien Blandin

Citations: 1

Understanding Urban Spatio-Temporal Usage Patterns Using Matrix Tensor Factorization

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00216

Thirunavukarasu Balasubramaniam, R. Nayak, C. Yuen

Citations: 4

Large Database Schema Matching using Data Mining Techniques

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00083

Debora G. Reis, M. Ladeira, M. Holanda, M. Victorino

Abstract: With the expanding diversity of database technologies and database sizes, it is becoming increasingly hard to identify similar relational databases among many large databases stored in different Database Management Systems (DBMS). Therefore, we propose to use data mining techniques to automatically identify similar structures of relational databases by comparing their metadata, which is composed of physical details of the databases. The amount of metadata is proportional to the size of the schema structure. The number of possible pairwise comparisons is quadratic in the number of schemas analyzed. Looking for the most efficient technique, we propose to calculate schema similarity by evaluating the distance of every schema to just one schema, which serves as a starting point. Schemas with close distances are taken to be more similar than schemas with larger distances. We compare this proposal against two other approaches. The first compares every schema against every other schema, skipping the inverse comparison. The second compares schemas within a group of schemas of similar size. To validate our proposal, an experiment is performed with 354 real schemas ranging in size from 2 to 20 thousand metadata items, totaling more than 26 thousand tables and 238 thousand columns. These schemas came from 5 different DBMS. The extracted metadata is transformed and formatted for comparing pairs of schemas. Textual features are compared using cosine distance and numerical features are compared using Euclidean distance. Then, hierarchical clustering is used to facilitate the visualization of the schemas that most closely resemble one another. Results showed that our approach was the most efficient because it compared all schemas and identified the most similar schemas by structure in less than 2 minutes. The extracted metadata was used to create the first version of the metadata repository and an initial version of a data catalog, which contributed to the knowledge of existing data. Using this procedure, duplicated schemas were discovered and then discontinued, resulting in cost savings of 10% while freeing up infrastructure resources. This solution is flexible: it supports a variety of schema sizes and DBMS.
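The comparison described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the toy schemas, the feature layout (`names` as textual features, `counts` as numerical features), and the unweighted sum that combines the two distances are all hypothetical.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """1 - cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

def schema_distance(schema, reference):
    """Cosine distance on textual features plus Euclidean distance on
    numerical features. The unweighted sum is an assumption."""
    text = cosine_distance(Counter(schema["names"]), Counter(reference["names"]))
    nums = euclidean_distance(schema["counts"], reference["counts"])
    return text + nums

def rank_against_reference(schemas, ref_name):
    """Compare every other schema to one reference schema (n - 1
    comparisons instead of all pairs) and sort by distance."""
    ref = schemas[ref_name]
    scored = [(name, schema_distance(s, ref))
              for name, s in schemas.items() if name != ref_name]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical toy metadata: table/column names and [table count, column count].
schemas = {
    "crm": {"names": ["customer", "id", "name", "order", "id", "total"], "counts": [2, 6]},
    "crm_copy": {"names": ["customer", "id", "name", "order", "id", "total"], "counts": [2, 6]},
    "hr": {"names": ["employee", "id", "salary"], "counts": [1, 3]},
}

ranked = rank_against_reference(schemas, "crm")
for name, dist in ranked:
    print(f"{name}: {dist:.3f}")  # crm_copy ranks first, hr second
```

In practice the numerical features would likely need scaling before being combined with a cosine distance that is bounded by 1; the sketch leaves that out for brevity.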

Citations: 3

Context Learning Network for Object Detection

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00103

Jiaxu Leng, Y. Liu, Tianlin Zhang, Pei Quan

Abstract: Current state-of-the-art detectors typically classify candidate proposals using their interior features. However, existing detectors do not yet fully exploit valuable context, which limits detection performance. In this paper, we present a context learning network (CLN), which aims to capture the pairwise relations between objects and the global context of each object. The proposed CLN consists of two subnetworks: a multi-layer perceptron (MLP) with three layers and a convolutional neural network (ConvNet) with two layers. The MLP is first designed to capture the pairwise relation context. The pairwise relation context is then gathered and concatenated so that the ConvNet can further learn the global context. Finally, we obtain the desired context feature maps, rich in contextual information that is useful for accurate object detection. The proposed CLN is lightweight and easy to embed in any existing object detection network. We present a context-aware Faster-RCNN with the proposed CLN and conduct extensive experiments to evaluate its performance. Experimental results demonstrate that the context-aware Faster-RCNN achieves state-of-the-art performance, with mAPs of 82.1%, 80.7% and 38.4% on the VOC 2007, VOC 2012 and COCO datasets, respectively.

Citations: 4

Exploring the Effect of Household Structure in Historical Record Linkage of Early 1900s Ireland Census Records

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00080

K. Frisoli, Rebecca Nugent

Abstract: Record linkage is the process of identifying records corresponding to unique entities across datasets. Linking historical data allows researchers to better characterize topics like population mobility, impacts of local/national events, and generational changes. Most record linkage algorithms rely on string similarities (e.g. edit distance of name); however, sometimes we expect to see changes not captured by standard text similarity metrics (e.g. name changes after marriage). The recently available 1901 and 1911 Ireland national census records have limited, non-standardized fields containing the typical errors associated with digitizing and formatting hand-written records. These issues, coupled with high frequencies of common names, are part of the reason traditional methods struggle. These methods often only consider pairwise information without incorporating household or relationship information across records (e.g. parents, siblings). However, the original census records correspond to households, which allows us to explore incorporating additional structure into traditional record linkage methods. In this paper, we describe an initial labeling procedure for a subset of County Carlow, Ireland and compare approaches for including household information into both supervised and unsupervised record linkage techniques.
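To make the idea in the abstract above concrete, here is a minimal Python sketch of a pairwise name score based on edit distance, combined with a household-overlap term. It is an assumption-laden illustration, not the authors' method: the records are made up, and `household_overlap`, `link_score`, and the 0.5 weight are hypothetical choices.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def name_similarity(a, b):
    """Edit distance normalised to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a.lower(), b.lower()) / longest

def household_overlap(members_a, members_b):
    """Jaccard overlap between the first names of other household members."""
    a, b = set(members_a), set(members_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def link_score(rec_a, rec_b, household_weight=0.5):
    """Blend pairwise name similarity with household overlap (hypothetical weighting)."""
    pair = name_similarity(rec_a["name"], rec_b["name"])
    house = household_overlap(rec_a["household"], rec_b["household"])
    return (1 - household_weight) * pair + household_weight * house

# Made-up records: two 1911 candidates share the same common name,
# so the name alone cannot separate them.
rec_1901 = {"name": "Mary Byrne", "household": ["John", "Patrick"]}
cand_same_house = {"name": "Mary Byrne", "household": ["John", "Patrick", "Anne"]}
cand_other_house = {"name": "Mary Byrne", "household": ["Michael"]}

print(link_score(rec_1901, cand_same_house))
print(link_score(rec_1901, cand_other_house))
```

With identical names, the pairwise term ties, and only the household term separates the two candidates, which is the situation the abstract attributes to high frequencies of common names.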

Citations: 5

Stock Market Prediction Analysis by Incorporating Social and News Opinion and Sentiment

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00195

Zhaoxia Wang, Seng-Beng Ho, Zhiping Lin

Abstract: A stock's price is an important indicator for a company, and many factors can affect its value. Different events may affect public sentiments and emotions differently, which may have an effect on the trend of stock market prices. Because they depend on various factors, stock prices are not static, but are instead dynamic, highly noisy and nonlinear time series data. Because of its strong learning capability on nonlinear time series prediction problems, machine learning has been applied to this research area. Learning-based methods for stock price prediction are very popular, and many enhanced strategies have been used to improve the performance of learning-based predictors. However, performing successful stock market prediction is still a challenge. News articles and social media data are also very useful and important in financial prediction, but currently no good method exists that takes these sources into consideration to provide better analysis of the financial market. This paper aims to predict stock prices by analyzing the relationship between the stock price and news sentiments. A novel enhanced learning-based method for stock price prediction is proposed that considers the effect of news sentiments. Compared with existing learning-based methods, the effectiveness of this new method is demonstrated on a real stock price data set, with improved performance in terms of reduced Mean Square Error (MSE). The research work and findings of this paper not only demonstrate the merits of the proposed method, but also point out the correct direction for future work in this area.
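The evaluation metric named in the abstract above, Mean Square Error, is the average of the squared differences between actual and predicted values. A minimal Python sketch follows; the price series and the two sets of predictions are made-up placeholders, not data or results from the paper.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up closing prices and two hypothetical predictors' outputs.
actual = [10.0, 11.0, 12.0]
preds_a = [10.5, 10.0, 13.0]
preds_b = [10.0, 11.5, 12.0]

print(mean_squared_error(actual, preds_a))
print(mean_squared_error(actual, preds_b))
```

A lower MSE means the predictions sit closer to the observed prices on average, with large misses penalised more heavily than small ones because the errors are squared.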

Citations: 21

NetDriller Version 2: A Powerful Social Network Analysis Tool

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00211

Salim Afra, Tansel Özyer, J. Rokne

Abstract: Social network analysis has gained considerable attention since Web 2.0 emerged and provided the ground for two-way interaction platforms. The immediate outcome is the availability of raw datasets which reflect social interactions between various entities. Indeed, social networking platforms and other communication devices are producing huge amounts of data which form valuable sources for knowledge discovery. Hence the need for automated tools like NetDriller, capable of maximizing the benefit from networked data. Most datasets that reflect some kind of many-to-many relationship can be represented as a network, that is, a graph of actors with relationships among each other. Many network analysis tools exist to extract knowledge from a constructed network. However, most of these tools require users to prepare as input a dataset that describes the complete network, which is then displayed and analyzed by the tool using the measures it supports. A different perspective has been employed to develop NetDriller as a network construction and analysis tool which does some tasks beyond what is normally available in existing tools. NetDriller fills the gap left by other tools by constructing a network from raw data using data mining techniques. In this paper, we describe the second version of NetDriller, which has recently been improved by adding new functions for richer and more effective network construction and analysis. This keeps the tool up to date and gives it high potential to handle the huge volume of networks and the different types of raw data available for analysis.
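The abstract above notes that data reflecting a many-to-many relationship can be represented as a network of actors. One simple way to do that, sketched below in Python, is to link two actors whenever they share a group. This is an assumed illustration only and does not reproduce NetDriller's own construction techniques; `build_network`, `degree`, and the toy records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_network(records):
    """Turn (actor, group) pairs from a many-to-many relation into a weighted
    actor-actor edge list. The weight counts how many groups a pair shares."""
    members = defaultdict(set)
    for actor, group in records:
        members[group].add(actor)
    edges = defaultdict(int)
    for actors in members.values():
        # sorted() gives each undirected edge one canonical (a, b) key
        for a, b in combinations(sorted(actors), 2):
            edges[(a, b)] += 1
    return dict(edges)

def degree(edges):
    """Number of distinct neighbours per actor."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)

# Made-up raw data: which author appears on which paper.
records = [
    ("ann", "p1"), ("bob", "p1"),
    ("ann", "p2"), ("bob", "p2"), ("cy", "p2"),
]

edges = build_network(records)
print(edges)
print(degree(edges))
```

The resulting edge list can then be passed to any network measure; degree is shown here only as the simplest example.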

Citations: 3

Python Source Code De-anonymization Using Nested Bigrams

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00011

Pegah Hozhabrierdi, D. Hitos, C. Mohan

Citations: 3

Requirements Definition with Extended Goal Graph

2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00039

Masafumi Ifuku, N. Kushiro, Yusuke Aoyama

Citations: 2