2018 IEEE International Conference on Data Mining Workshops (ICDMW): Latest Articles

Effects of Negative Customer Reviews on Sales: Evidence Based on Text Data Mining
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00124
Z. Li, Fangzhou Li, Jing Xiao, Zhi Yang
{"title":"Effects of Negative Customer Reviews on Sales: Evidence Based on Text Data Mining","authors":"Z. Li, Fangzhou Li, Jing Xiao, Zhi Yang","doi":"10.1109/ICDMW.2018.00124","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00124","url":null,"abstract":"The effects of online customer reviews on sales have been widely studied, especially with regard to Internet shopping and e-retailing. The majority of the literature, however, has focused on the volume and valence, and produced mixed results. For an in-depth understanding of the effects of customer reviews on sales, the present study uses text data mining to investigate how the text content of negative reviews impacts online sales. Content association and topic extraction were the methods used. Relevant data were collected from JD.com, and three factors - content topic, proportion, and consistency - were highlighted to unveil the underlying mechanism of effects of negative reviews on sales. The study results will enable marketers to understand the effects of customer reviews on Internet sales and help them to improve customer satisfaction and loyalty. Management implications and future research directions are also presented.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126659490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Robust Commuter Movement Inference from Connected Mobile Devices
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00099
Baoyang Song, Hasan A. Poonawala, L. Wynter, Sebastien Blandin
{"title":"Robust Commuter Movement Inference from Connected Mobile Devices","authors":"Baoyang Song, Hasan A. Poonawala, L. Wynter, Sebastien Blandin","doi":"10.1109/ICDMW.2018.00099","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00099","url":null,"abstract":"The preponderance of connected devices provides unprecedented opportunities for fine-grained monitoring of the public infrastructure. However, while classical models expect high-quality application-specific data streams, the promise of the Internet of Things (IoT) is that of an abundance of disparate and noisy datasets from connected devices. In this context, we consider the problem of estimating the level of service of a city-wide public transport network. We first propose a robust unsupervised model for train movement inference from Wi-Fi traces, via the application of robust clustering methods to a one-dimensional spatio-temporal setting. We then explore the extent to which the demand-supply gap can be estimated from connected devices. We propose a classification model of real-time commuter patterns, including both a batch training phase and an online learning component. We describe our deployment architecture and assess our system accuracy on a large-scale anonymized dataset comprising more than 10 billion records.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"8 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114035992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Understanding Urban Spatio-Temporal Usage Patterns Using Matrix Tensor Factorization
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00216
Thirunavukarasu Balasubramaniam, R. Nayak, C. Yuen
{"title":"Understanding Urban Spatio-Temporal Usage Patterns Using Matrix Tensor Factorization","authors":"Thirunavukarasu Balasubramaniam, R. Nayak, C. Yuen","doi":"10.1109/ICDMW.2018.00216","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00216","url":null,"abstract":"Automated understanding of the spatio-temporal usage patterns of real-world applications is significant in urban planning. With smartphones able to collect various information using inbuilt sensors, smart city data is enriched with multiple contexts. Whilst tensor factorization has been successfully used to capture latent factors (patterns) exhibited by real-world datasets, the multifaceted nature of smart city data needs improved modeling to utilize multiple contexts under sparse conditions. Thus, in our ongoing research, we aim to model this data with a novel Context-Aware Nonnegative Coupled Sparse Matrix Tensor (CAN-CSMT) framework, which imposes a sparsity constraint to learn the true factors in sparse data. We also aim to develop a fast and efficient factorization algorithm to deal with the scalability problem that persists in state-of-the-art factorization algorithms.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122934229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Large Database Schema Matching using Data Mining Techniques
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00083
Debora G. Reis, M. Ladeira, M. Holanda, M. Victorino
{"title":"Large Database Schema Matching using Data Mining Techniques","authors":"Debora G. Reis, M. Ladeira, M. Holanda, M. Victorino","doi":"10.1109/ICDMW.2018.00083","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00083","url":null,"abstract":"With the expanding diversity of database technologies and database sizes, it is becoming increasingly hard to identify similar relational databases among many large databases stored in different Database Management Systems (DBMS). Therefore, we propose to use data mining techniques to automatically identify similar structures of relational databases by comparing their metadata, which is composed of physical details of the databases. The amount of metadata is proportional to the size of the schema structure. The number of possible pairwise comparisons is quadratic in the number of schemas analyzed. Looking for the most efficient technique, we propose to calculate the schema similarity by evaluating the distance from every schema to a single schema, which serves as a starting point. Schemas with smaller distances are more similar than schemas with larger distances. We compare this proposal against two other approaches. The first approach compares every schema against all other schemas, excluding the inverse comparison. The second approach compares schemas within a group of schemas with similar sizes. To validate our proposal, an experiment is performed with 354 real schemas ranging in size from 2 to 20 thousand metadata items, totaling more than 26 thousand tables and 238 thousand columns. Those schemas came from 5 different DBMS. The metadata extracted is transformed and formatted for comparing pairs of schemas. The textual features are compared using Cosine Distance and numerical features are compared using Euclidean Distance. Then, the hierarchical cluster technique is used to facilitate the visualization of the schemas that most closely resemble one another. Results showed that our approach was the most efficient because it compared all schemas and identified the most similar schemas by their structure in less than 2 minutes. The extracted metadata was used to create the first version of the metadata repository and an initial version of a data catalog, which contributed to the knowledge of existing data. Using this procedure, duplicated schemas were discovered and then discontinued, resulting in cost savings of 10% while freeing up infrastructure resources. This solution is flexible: it supports a variety of schema sizes and DBMS.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131569197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
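The distance-to-one-reference idea in the schema matching abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `schemas` dictionary, its `names` and `sizes` fields, and the helper `distances_to_reference` are all hypothetical, and the abstract does not say how the textual and numerical distances are combined, so they are simply returned side by side.

```python
import math
from collections import Counter


def cosine_distance(tokens_a, tokens_b):
    """Cosine distance between two bags of tokens (0 means identical direction)."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def distances_to_reference(schemas, reference):
    """Compare every schema to one reference schema: n comparisons, not n*(n-1)/2."""
    ref = schemas[reference]
    return {
        name: (
            cosine_distance(s["names"], ref["names"]),
            euclidean_distance(s["sizes"], ref["sizes"]),
        )
        for name, s in schemas.items()
    }


# Hypothetical toy metadata: column names (textual) and [tables, columns] (numerical).
schemas = {
    "hr": {"names": ["id", "name", "salary"], "sizes": [1, 3]},
    "hr_copy": {"names": ["id", "name", "salary"], "sizes": [1, 3]},
    "sales": {"names": ["id", "order", "total", "date"], "sizes": [5, 6]},
}
result = distances_to_reference(schemas, "hr")
```

In this toy data, `hr_copy` sits at distance zero from the reference on both measures, which is the kind of signal that would flag a duplicated schema. A hierarchical clustering step, as the abstract mentions, would then operate on these distances.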
Context Learning Network for Object Detection
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00103
Jiaxu Leng, Y. Liu, Tianlin Zhang, Pei Quan
{"title":"Context Learning Network for Object Detection","authors":"Jiaxu Leng, Y. Liu, Tianlin Zhang, Pei Quan","doi":"10.1109/ICDMW.2018.00103","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00103","url":null,"abstract":"Current state-of-the-art detectors typically classify candidate proposals using their interior features. However, valuable contexts are not yet fully exploited by existing detectors, which limits detection performance. In this paper, we present a context learning network (CLN), which aims to capture the pairwise relations between objects and the global context of each object. The proposed CLN consists of two subnetworks: a multi-layer perceptron (MLP) with three layers and a convolutional neural network (ConvNet) with two layers. The MLP is first designed to capture the pairwise relation context. The pairwise relation context is then gathered and concatenated so that the ConvNet can further learn the global contexts. Finally, we obtain the desired context feature maps with rich contextual information that are useful for accurate object detection. The proposed CLN is lightweight and easy to embed in any existing network for object detection. In this paper, we present a context-aware Faster-RCNN with the proposed CLN and conduct extensive experiments to evaluate its performance. Experimental results demonstrate that the context-aware Faster-RCNN achieves state-of-the-art performance, with mAPs of 82.1%, 80.7% and 38.4% on the VOC 2007, VOC 2012 and COCO datasets, respectively.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133500225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Exploring the Effect of Household Structure in Historical Record Linkage of Early 1900s Ireland Census Records
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00080
K. Frisoli, Rebecca Nugent
{"title":"Exploring the Effect of Household Structure in Historical Record Linkage of Early 1900s Ireland Census Records","authors":"K. Frisoli, Rebecca Nugent","doi":"10.1109/ICDMW.2018.00080","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00080","url":null,"abstract":"Record linkage is the process of identifying records corresponding to unique entities across datasets. Linking historical data allows researchers to better characterize topics like population mobility, impacts of local/national events, and generational changes. Most record linkage algorithms rely on string similarities (e.g. edit distance of name); however, sometimes we expect to see changes not captured by standard text similarity metrics (e.g. name changes after marriage). The recently available 1901 and 1911 Ireland national census records have limited, non-standardized fields containing the typical errors associated with digitizing and formatting hand-written records. These issues, coupled with high frequencies of common names, are part of the reason traditional methods struggle. These methods often only consider pairwise information without incorporating household or relationship information across records (e.g. parents, siblings). However, the original census records correspond to households, which allows us to explore incorporating additional structure into traditional record linkage methods. In this paper, we describe an initial labeling procedure for a subset of County Carlow, Ireland and compare approaches for including household information into both supervised and unsupervised record linkage techniques.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133378092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Stock Market Prediction Analysis by Incorporating Social and News Opinion and Sentiment
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00195
Zhaoxia Wang, Seng-Beng Ho, Zhiping Lin
{"title":"Stock Market Prediction Analysis by Incorporating Social and News Opinion and Sentiment","authors":"Zhaoxia Wang, Seng-Beng Ho, Zhiping Lin","doi":"10.1109/ICDMW.2018.00195","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00195","url":null,"abstract":"The price of a stock is an important indicator for a company, and many factors can affect its value. Different events may affect public sentiments and emotions differently, which may have an effect on the trend of stock market prices. Because they depend on various factors, stock prices are not static but are instead dynamic, highly noisy and nonlinear time series data. Due to its great learning capability for solving nonlinear time series prediction problems, machine learning has been applied to this research area. Learning-based methods for stock price prediction are very popular, and many enhanced strategies have been used to improve the performance of learning-based predictors. However, performing successful stock market prediction is still a challenge. News articles and social media data are also very useful and important in financial prediction, but currently no good method exists that can take these social media into consideration to provide better analysis of the financial market. This paper aims to successfully predict stock price through analyzing the relationship between the stock price and the news sentiments. A novel enhanced learning-based method for stock price prediction is proposed that considers the effect of news sentiments. Compared with existing learning-based methods, the effectiveness of this new enhanced learning-based method is demonstrated on a real stock price data set, with improved performance in terms of reduced Mean Square Error (MSE). The research work and findings of this paper not only demonstrate the merits of the proposed method but also point out the correct direction for future work in this area.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"135 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114017711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
NetDriller Version 2: A Powerful Social Network Analysis Tool
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00211
Salim Afra, Tansel Özyer, J. Rokne
{"title":"NetDriller Version 2: A Powerful Social Network Analysis Tool","authors":"Salim Afra, Tansel Özyer, J. Rokne","doi":"10.1109/ICDMW.2018.00211","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00211","url":null,"abstract":"Social network analysis has gained considerable attention since Web 2.0 emerged and provided the ground for two-way interaction platforms. The immediate outcome is the availability of raw datasets which reflect social interactions between various entities. Indeed, social networking platforms and other communication devices are producing huge amounts of data which form valuable sources for knowledge discovery. Hence the need for automated tools like NetDriller that are capable of maximizing the benefit from networked data. Most datasets which reflect some kind of many-to-many relationship can be represented as a network, which is a graph consisting of actors having relationships among each other. Many tools exist for network analysis that are intended to extract knowledge from a constructed network. However, most of these tools require users to prepare as input a dataset that describes the complete network, which is then displayed and analyzed by the tool using the measures supported. A different perspective has been employed to develop NetDriller as a network construction and analysis tool which does some tasks beyond what is normally available in existing tools. NetDriller fills the gap left by other tools by constructing a network from raw data using data mining techniques. In this paper, we describe the second version of NetDriller, which has recently been improved by adding new functions for richer and more effective network construction and analysis. This keeps the tool up to date and gives it high potential to handle the huge volume of networks and the different types of raw data available for analysis.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114484984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Python Source Code De-anonymization Using Nested Bigrams
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00011
Pegah Hozhabrierdi, D. Hitos, C. Mohan
{"title":"Python Source Code De-anonymization Using Nested Bigrams","authors":"Pegah Hozhabrierdi, D. Hitos, C. Mohan","doi":"10.1109/ICDMW.2018.00011","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00011","url":null,"abstract":"An important issue in cybersecurity is the insertion or modification of code by individuals other than the original authors of the code. This motivates research on authorship attribution of unknown source code. We address the deficiencies of previously used feature extraction methods and propose a novel approach: Nested Bigrams. Such features are easy to extract and carry substantial information about the interconnections between the nodes of the abstract syntax tree. We also show that for a large number of authors, a Strongly Regularized Feed-forward Neural Network outperforms the Random Forest Classifier used in many code stylometric studies. A new ranking system for reducing the number of features is also proposed, and experiments show that this approach can reduce the feature set to 98 nested bigrams while maintaining a classification accuracy above 90 percent.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122565974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
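To make the abstract's mention of abstract syntax tree features concrete, here is a minimal sketch using Python's standard `ast` module. The abstract does not define nested bigrams precisely, so this assumes one plausible reading: counting (parent node type, child node type) pairs. The function name `ast_bigrams` is hypothetical, and the paper's actual feature definition may differ.

```python
import ast
from collections import Counter


def ast_bigrams(source):
    """Count (parent type, child type) pairs over the AST of a Python source string."""
    tree = ast.parse(source)
    counts = Counter()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            counts[(type(parent).__name__, type(child).__name__)] += 1
    return counts


# Example: structural features of a one-line function body.
counts = ast_bigrams("def f(x):\n    return x + 1\n")
```

A count vector of this kind could feed a classifier such as the feed-forward network or random forest named in the abstract. Feature ranking and selection are outside the scope of this sketch.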
Requirements Definition with Extended Goal Graph
2018 IEEE International Conference on Data Mining Workshops (ICDMW) Pub Date : 2018-11-01 DOI: 10.1109/ICDMW.2018.00039
Masafumi Ifuku, N. Kushiro, Yusuke Aoyama
{"title":"Requirements Definition with Extended Goal Graph","authors":"Masafumi Ifuku, N. Kushiro, Yusuke Aoyama","doi":"10.1109/ICDMW.2018.00039","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00039","url":null,"abstract":"Significant requirements are often discovered during discussion about tradeoffs and conflicts between stakeholders in requirements meetings. A method for handling tradeoffs and conflicts would be a breakthrough in acquiring significant requirements that are difficult for requirements analysts to elicit. In this paper, the Extended Goal Graph (EGG) is proposed as a method for handling tradeoffs and conflicts by providing traceability between requirements analysis and system design. We developed the EGG system to support the requirements definition process with the EGG. The system was applied to a requirements definition meeting among medical doctors and potential patients for selecting the proper inspections required for diagnosing disease when shadows on the lungs are found.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125135241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2